For the first installment of this “Can I write an Android app entirely with AI” project, I had a broken arm. Which meant that I did the coding first and then wrote the blog post afterwards, knowing the results. Now I’m pretty much healed, so for this second installment I started writing an outline for the blog post, expecting to build on it as I coded. I had set out three tasks as goals: two smaller tasks, similar to what had already been completed, and one significant task that would require more design work.

Dear reader, I made it through one task. So my elegantly crafted introduction that I already wrote is useless and now we’re getting whatever this is.

My goal, at least, was still accomplished: investigating the Agent Mode of Gemini in Android Studio, which was released sometime in August. Spoiler for the conclusion of this post: I didn’t get far before becoming frustrated with the absolute garbage that Agent Mode would generate.

Same as last time, the resulting code has been posted on GitHub. Let’s get into it!


What is Agent Mode?

Agent Mode, or Agentic AI as it’s called elsewhere, is the idea that instead of asking your AI tool questions, you can use the tool as an agent that operates on your behalf to perform some sort of task. An example I’ve seen used a lot: something like ChatGPT can answer “what’s the best hotel in Tokyo?”, whereas an AI operating as an agent would be able to reserve you a room at the best hotel in Tokyo. One is answering, the other is doing. This article from IBM goes into depth about the differences, benefits, and pitfalls of agentic AI.

For coding tools, it generally means that instead of merely answering questions about code, the AI tool can actually generate code, insert it, save it, delete existing code, and so on. That sounds really appealing to me as a developer: I could offload some of the more tedious aspects of physically writing code and focus on the larger picture of what I’m creating. It also sounds like an excellent tool for this experiment, since instead of me copying and pasting the AI’s answer, the AI can just do that too.

Agent Mode in Practice

When you open the Agent Mode tab in Android Studio, you’re prompted with some examples of how to use the tool.

These prompts are a little more technical than what I’ve been using for building this app, but I decided to just try a more basic prompt anyway.

I’m not entirely sure why it started looking for UserPreferences? The PackingList data class is defined in MainActivity.kt, because like half the code is defined in MainActivity at this point. So I provided MainActivity as the response to its request, and it gave me some code, which I blindly approved.

And that code was awful, from both a design perspective and a functional perspective.

It made each title a permanent edit text? And the state management was completely busted: the text would overwrite itself as I was typing. Really nowhere close to an experience that I want in an application.
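If you haven’t hit this particular Compose failure mode before, here’s a minimal sketch of the kind of anti-pattern that produces it. To be clear, this is my reconstruction with made-up names, not the actual code Agent Mode generated:

import androidx.compose.material3.TextField
import androidx.compose.runtime.Composable
import androidx.compose.runtime.collectAsState
import androidx.compose.runtime.getValue
import kotlinx.coroutines.flow.MutableStateFlow

// Hypothetical stand-in for state that round-trips through an async store
// (e.g., DataStore) on every keystroke.
class TitleHolder {
    val title = MutableStateFlow("")
    fun save(newTitle: String) { title.value = newTitle } // imagine an async DataStore write here
}

@Composable
fun BrokenTitleField(holder: TitleHolder) {
    // The field's value comes from the Flow, and every keystroke writes back
    // to it. With a real async store in the loop, stale emissions arrive after
    // you've typed more characters and clobber them.
    val text by holder.title.collectAsState()
    TextField(value = text, onValueChange = { holder.save(it) })
}

With a plain in-memory StateFlow this happens to hold together, but route the write through DataStore and the re-emission lags the keyboard, which matches the overwriting I was seeing.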

At this point, I’m not feeling Agent Mode. I find myself pining for the regular chat, which had been fairly successful at generating functional UI up to this point. So I revert everything Agent Mode just did and ask the same question to the regular Gemini chat.

And it was much more successful. Only two compilation errors and then everything just worked. It added an edit button to the title bar to edit the list name. A little clunky looking, in my opinion, but infinitely better than what Agent Mode came up with.
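For contrast, the shape that behaves is to keep the edit in local state and only commit it on confirm, so nothing asynchronous fights the keyboard. A sketch of that pattern, with assumed names rather than the code chat actually generated:

import androidx.compose.material3.AlertDialog
import androidx.compose.material3.Text
import androidx.compose.material3.TextButton
import androidx.compose.material3.TextField
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.saveable.rememberSaveable
import androidx.compose.runtime.setValue

// Sketch only: the dialog, parameters, and names are assumptions.
@Composable
fun RenameListDialog(
    currentName: String,
    onConfirm: (String) -> Unit,
    onDismiss: () -> Unit
) {
    // Edits live in local state; nothing round-trips through the store per
    // keystroke, so typing stays stable until the user hits Save.
    var draft by rememberSaveable { mutableStateOf(currentName) }
    AlertDialog(
        onDismissRequest = onDismiss,
        title = { Text("Rename list") },
        text = { TextField(value = draft, onValueChange = { draft = it }) },
        confirmButton = { TextButton(onClick = { onConfirm(draft) }) { Text("Save") } },
        dismissButton = { TextButton(onClick = onDismiss) { Text("Cancel") } }
    )
}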

Am I surprised by this? Kind of, but not really. The supplied example prompts really suggest giving Agent Mode less open-ended statements. So I decide to change direction a little bit and ask Agent Mode to do something more straightforward.

Can Agent Mode Write Unit Tests?

I kind of hate unit tests. Like, not that I hate having tested code, but I find them tedious to write in most cases. Tedious, formulaic code sounds like a perfect task for AI.

Well, that’s a great start. I’m glad it took the stock unit test provided when you create a new project and decided that I was beyond helping. I did what it suggested, however, and prompted it to test the PackingListViewModel, as that seemed like the only file that wasn’t a disaster of multiple classes. Essentially, tossing Gemini a softball to see if it could finally get a hit.

And it seemed like it made some progress: it created a new file and some tests. But then I ran the tests, and only three of them passed, including the example test that just adds two numbers. Wonderful.

I put on my actual software engineering hat, and it looks like the tests might not be updating the fake DataStore properly? This is the part of unit testing I hate, figuring out the specifics of a unit testing framework, so I am just thrilled to have to engage with it here.
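For anyone lucky enough to have avoided this fight: with kotlinx-coroutines-test, coroutines launched on a StandardTestDispatcher don’t run until you advance the test scheduler, which is exactly the kind of thing that makes a fake store look like it never updates. A minimal sketch, with a made-up fake store standing in for the app’s actual DataStore setup:

import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.StandardTestDispatcher
import kotlinx.coroutines.test.resetMain
import kotlinx.coroutines.test.runTest
import kotlinx.coroutines.test.setMain
import org.junit.After
import org.junit.Assert.assertEquals
import org.junit.Before
import org.junit.Test

// Hypothetical in-memory stand-in for the app's DataStore-backed storage;
// the real project's types are different.
class FakeStore {
    val lists = MutableStateFlow<List<String>>(emptyList())
    suspend fun save(newLists: List<String>) { lists.value = newLists }
}

@OptIn(ExperimentalCoroutinesApi::class)
class FakeStoreTest {
    private val dispatcher = StandardTestDispatcher()

    @Before fun setUp() = Dispatchers.setMain(dispatcher)
    @After fun tearDown() = Dispatchers.resetMain()

    @Test
    fun `save updates the fake store`() = runTest {
        val store = FakeStore()
        // A ViewModel would launch this in viewModelScope; simulate that here.
        launch { store.save(listOf("Sunscreen")) }
        // Without advancing the scheduler, the launched coroutine never runs
        // and the assertion below would see the stale empty list.
        dispatcher.scheduler.advanceUntilIdle()
        assertEquals(listOf("Sunscreen"), store.lists.value)
    }
}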

I now ask it to fix the five failing tests, and the request times out. There is no way to refresh or retrigger the same question after a timeout, so I ask again. Now the tests don’t even compile, because it has hallucinated an entire “Item” class.

@Test
fun `updatePackingListsOnly updates only the packing lists`() = runTest {
    val originalItem = Item(id = 1, name = "Sunscreen", isPacked = false) // Assuming Item structure

Why is it assuming the Item structure? It has access to the code; it should be able to see that there is no Item class!

At this point my husband, also a computer engineer, looks over my shoulder and asks if the Agent Mode can run the tests itself, because the other AI tools he’s used just automatically run the tests without prompting.

Can you run the tests you just fixed?

It seems the tests are failing because the Item data class is not recognized in the PackingListViewModelTest.kt file. This usually means that Item is either not defined, not imported correctly, or not accessible from the test source set.

Could you please show me the definition of the Item data class and tell me which file it's in? It's likely in MainActivity.kt along with PackingList.

Once I see how Item is defined, I can adjust the test file to correctly reference it, or if necessary, add a mock definition within the test file for the tests to compile and run.

ARE YOU SERIOUS YOU JUST HALLUCINATED THIS. YOU HAVE ACCESS TO MY CODE! Why are you asking me to see the definition, what are you even parsing??

Okay, I took a brief break to prevent myself from crashing out over an AI tool that I am using “for fun”. I’m going to do what I did before and revert all this crap and ask the regular chat mode for help.

And that mostly worked, besides one compilation error. It seems like neither version of the AI likes the SelectableItem class, as this is the second time it’s hallucinated what the fields are named. I’m curious whether that’s because there are a lot of other random things in the actual SelectableItem file (like the SelectableItemScreen and the view for SelectableItem). I feel like this would be a great use case for Agent Mode, if it worked: refactoring the existing code into new files.

However, in the PackingListViewModelTest.kt file I generated, I mistakenly used the property name “name” when creating instances of SelectableItem for testing.

At least Gemini is owning its mistakes this time. After being faced with the error it generated, it conjured up a fix and now all tests are passing!

Can Agent Mode Do Literally the Bare Minimum?

Okay, I’m going to give Agent Mode one more chance, based on my offhand thought above. I’m going to see if it can refactor code that is already written into separate files. Nothing new needs to be added, just creating new files and copying, pasting, and deleting code that already exists and works. Given the current track record, I am extremely skeptical that this will be successful. But if unit tests were the softball throw, then refactoring existing code is the T-ball.

It looks like it was successful. It broke out the various classes and components that were in MainActivity.kt and SelectableItem.kt into their own files. Congratulations, Agent Mode, you copied, pasted, and deleted code, something I’m pretty sure I could teach a seven-year-old how to do.
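Concretely, the refactor came out roughly like this (my paraphrase of the layout, not the exact file names):

// Before: a couple of files doing everything
//   MainActivity.kt         -- the Activity, PackingList, assorted screens
//   SelectableItem.kt       -- the data class, SelectableItemScreen, the item view
//
// After: one concern per file
//   MainActivity.kt         -- just the Activity
//   PackingList.kt
//   SelectableItem.kt       -- just the data class
//   SelectableItemScreen.kt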

Conclusion

Gemini Agent Mode is bad. It’s not good.

I was pretty impressed by Gemini chat mode last time, and I still am today. There are some things it gets wrong, but most of the time I can just copy what it generates and end up with something workable. Agent Mode seemed like an easy win to add to the workflow, but it failed.

And I’m not quite sure why it failed. I don’t know why the code that Gemini chat comes up with is so much better than Agent Mode’s. If I’m asking the same question of each, why are the answers so different? They both (in theory) have access to the same code. I’m guessing the AI models are different behind the scenes, but it still doesn’t make sense to me how there can be such a stark difference between the two.

It does warn me when I open the Agent Mode tab that it’s currently “in Preview”. So I guess what I’ve used isn’t considered the final form and will hopefully improve as time goes on. But the Android Studio documentation page literally suggests using Agent Mode to write unit tests. I’m not going outside the bounds of what the tool is designed for.

Lies!!!!

I would not recommend Gemini Agent Mode in its current iteration. It is far faster to ask the same question to chat mode and copy and paste the answer than to correct whatever Agent Mode inserts. It’s also far less frustrating.

I still want to continue this exploration into AI tooling, to keep pushing to see where AI begins to falter. I still have those two tasks that I wanted to fit into this blog post, after all. However, I will be shelving Gemini Agent Mode until there are significant updates.

If you want to see where this experiment goes, subscribe and you’ll get all the updates in your inbox!
