
How AI Saved my Trash Coding and Research Project

My Experience with Large Language Models in an Academic Setting and how I accidentally became an expert on AI visual identification tools for conservation.


My Experience

I recently finished the paper for a research project I started during my middle years of college. Funnily enough, I found the topic while trying to automate my own job of animal identification for biological sciences for Ella DiPetto. The job was simple: click through thousands of images and try to spot an animal in each one. Needless to say, I did not stay in the job for long because I value new experiences and gaining knowledge (even though it paid decently for basically listening to music. You live and learn).

I am primarily in the business of finance/accounting and have had a heavy interest in computers my entire time growing up. However, at least during that time I was much more on the side of “computing” than “coding”. Before the prevalence of language models, coding did not seem like a topic that was approachable without some form of academic/class setting. And for some reason younger me, seeing all the 0s and 1s flying across the screen, thought “real coding” was binary analysis and Assembly. Even though that’s technically true, there is value to the layers of abstraction you put on computing to achieve a specific task, even though some fundamental understanding is lost.

In a research setting, there is identification of a problem domain or discovery, defining your output, setup of design, collaboration, and most importantly reproducibility. I discovered my “problem” when I understood that the current generation of models could not be used due to the need for exact data, and that humans, at least for now, are uniquely primed for identifying animals in their environments. The entire dataset, which comprised some millions of images, was reviewed once initially, then 50% of it was randomly reviewed a second time. Over time, the problem became: what were the limitations of these visual identification models? For DiPetto’s environments specifically, it was animals being missed or random objects being identified as animals.

However, I also decided to do some analysis on the mathematics of optimizing the image identification model. Needless to say, none of it worked, and you should train neural networks on training data and then optimize them with proper machine learning techniques.

I did not know how to do that at the time. What I did know was the specific question I wanted to ask of the data that I had. Before that even began, though, I had to understand how to get to the values that I wanted. I spent a solid three days understanding how models were benchmarked (with hits and misses) and how I wanted to achieve that with the data that I had. Once I had that understanding, I then moved on to the data manipulation.
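The benchmarking idea above (hits and misses) boils down to comparing what a human reviewer found against what the model found, image by image. Here is a minimal, hypothetical sketch of that tally (the project itself was written in R; this Python version only shows the shape of the idea):

```python
from collections import Counter

def outcome(human_found_animal: bool, model_found_animal: bool) -> str:
    """Classify one image's result in signal-detection terms."""
    if human_found_animal and model_found_animal:
        return "hit"
    if human_found_animal:
        return "miss"            # animal present, model missed it
    if model_found_animal:
        return "false alarm"     # model flagged an object that wasn't an animal
    return "correct rejection"

def benchmark(pairs):
    """Tally outcomes over (human_label, model_label) pairs for a set of images."""
    return Counter(outcome(h, m) for h, m in pairs)
```

Tallies like these are exactly the counts a confusion matrix is built from.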

One problem: I barely knew how to code. Some HTML here, some CSS there, but nothing to the extent of actually making a logical structure of my thought.

All the way back in my sophomore/junior year of high school (2018-2019), I was putting out feelers for a senior project. One project that came up was helping a local nonprofit with some server setup involving SQL. I eventually moved on because I found a better project, but was thinking in my head, “man, what if there was a way to plain-English these SQL queries to make the output easier”.

During 2023, back in college, Mr. Gippity (ChatGPT-4) was slowly becoming popular and started to get on my horizon. I had heard about GPT-3 and touched it once or twice but originally wrote it off as a fancy storyteller and not much else.

I started asking my questions on Stack Overflow (“Implementation of Cobb-Douglas Utility Function to calculate Receiver Operator Curve & AUC”), and by the saving grace of user ‘L Tyrone’ I was given a general guideline and a solution to build off of.

Until one day I realized you could ask LLMs how to code. Everything clicked. It wasn’t an immediate click, as you still need good prompting to make things work, but it was the accelerant I needed to continue my project. I was able to translate my user requirements into something tangible for data manipulation. I still looked at the input, checked the code, and checked the output. I was trying to publish a paper, after all. Over time, I slowly started to hit the marks of computing performance limitations, memory management, and parallelism, and to upgrade to my college’s high-performance computing cluster for more CPU and GPU processing power. Each was discovered through my inexperience in knowing what limitations I had.

“Huh, why is my computer freezing and at 100% RAM usage?” “Oh, I’m either going to need to wait 2 years or learn how to make things faster.”

Through this entire process I knew exactly what I needed to do with my data and had a fundamental understanding of confusion matrices, receiver operating characteristic (ROC) curves, precision-recall curves, and area under the curve (AUC) values per the trapezoidal rule, along with all the other mathematics that needed to be coded against my data. I also had an understanding of why I was performing this specific analysis and of the ways I needed to clean my data.
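To make that math concrete: an ROC curve is traced by sweeping a decision threshold over the model's confidence scores, and the AUC is the area under those points via the trapezoidal rule. A small illustrative Python sketch (not the project's actual R code):

```python
def confusion_counts(truth, pred):
    """Counts for a binary confusion matrix from boolean labels/predictions."""
    tp = sum(t and p for t, p in zip(truth, pred))
    fp = sum((not t) and p for t, p in zip(truth, pred))
    fn = sum(t and (not p) for t, p in zip(truth, pred))
    tn = sum((not t) and (not p) for t, p in zip(truth, pred))
    return tp, fp, fn, tn

def roc_points(truth, scores):
    """One (FPR, TPR) point per distinct threshold, high to low."""
    pts = []
    for thr in sorted(set(scores), reverse=True):
        pred = [s >= thr for s in scores]
        tp, fp, fn, tn = confusion_counts(truth, pred)
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        pts.append((fpr, tpr))
    return [(0.0, 0.0)] + pts + [(1.0, 1.0)]

def auc_trapezoid(points):
    """Area under the curve via the trapezoidal rule."""
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
```

A model whose scores perfectly separate animal images from empty ones yields an AUC of 1.0; random scoring hovers around 0.5.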

Sure, there’s nothing stopping me now from having an AI language model write a few scripts that unzip some JSON files, import them, make some mathematical computations, and graph the output. But I would have missed out on two very important factors:

1) The learning involved in the process.
2) Was the process correct for the data analysis I wanted? Was the question I was asking correct in the first place? I still needed to have the right questions to approach the problem. I include this point because my paper was an entire representation of overthinking it the wrong way.

I had a process mapped inside my mind and used GPT-4 to piecemeal those processes into usable functions in RStudio. Through this project, utilizing relatively “low power” models compared to the latest generation, I realize now that understanding how to use a tool is fundamental to enhancing your productivity rather than replacing it. I do not care how my graph is made; I care why it is made and what it references. In software development terms: does it pass my test for acceptability?

To some extent during the summer of 2024, I experienced what Steve Yegge has described as “AI fatigue”. The project was the main thing I was focused on during the summer. I was able to iterate, manipulate, experiment, verify, diagnose, debug, and develop at a speed that would not have been possible sifting through structured documentation and Stack Overflow threads. Once I completed a step and accepted the LLM’s output, it was then my cognitive load and responsibility to determine what to do next. I also realized that by not being specifically goal-oriented about what final output I wanted, I had a lot of time for experimentation. While it was a good learning experience, it wasted time in relation to the project I wanted to complete.

What did using the tool teach me and what are my thoughts?

Now let’s go to accounting; bear with me. The driest topic you will ever study in an accounting course is audit. I know you might be asking, “how can accounting get more boring?” It can. It is the physical and mental discipline of watching paint dry in a classroom while reading a textbook. The actual practice of audit is drastically more fun in comparison. Within it, there is a very important concept: internal controls. Internal controls are the mechanisms by which a company governs why and how a specific process happens within the organization. Without proper internal controls, there are avenues for errors and fraud to occur.

This is especially important in an academic discipline. Even with my small project being under 1,000 lines of code, I still think it needs to be refactored to some extent to be called clean and readable. The technical debt I accumulated from structuring my project was partly due to my inexperience and partly caused by the use of ChatGPT-4. However, I was able to pick the project back up after a year of inactivity because of the effort I put in to make it understandable and workable for my future self, such as over-commenting the entire code base to make it literally readable like a book.

We are slowly reaching a point in time at which documentation, reproducibility, and quantifiability are more important than ever. And until AI models get more advanced, they are not up to that. I distinctly remember, during my time as a tax intern, one manager who was fighting with Microsoft Copilot, repeatedly correcting it and saying “no, that’s wrong”. I find it kind of funny that they didn’t understand this and were basically prompting an autocomplete with no learning functionality at that point in time (summer 2025) about tax code on which they had more subject matter expertise. Granted, tax is one of the positions most primed for AI/LLM utilization, because it interprets the letter of the law into financial treatment, which has some, but usually little, wiggle room. But again, the systems and tax software on which these things are executed usually have defined inputs, which are optimal for an automation task.

I find it funny now that with the implementation of ‘skills’, which are reusable capabilities for AI agents, they have some of the most in-depth documentation on how a certain process is defined and executed. The way I think about these tools is as deterministic probability engines, e.g. “what is the next best step given a certain input?” While self-correction and other capabilities are currently being improved upon, at the end of the day it is an extremely advanced tool for YOU to use until things get advanced enough. I’ll end this post with a short conversation I had with my friend, whose responses I share a sentiment with.

Me: What do you think will cause an intelligence boom?

What I think would cause an intelligence boom is finding a way to get AI so accurate that it’s effectively never wrong, or, if it doesn’t know something, has the agency to figure it out accurately.

Another problem is hardware and infrastructure. We need next generation hardware to hold our new models.

AI also needs to unlock all types of creativity. It can combine ideas and modify existing ideas, but it’s bad with mostly new ideas or unique solutions to problems. It’s arguable that no solution is unique, but there’s at least a gradient of how unique a solution is, and AI sits near the bottom of it.

The last thing we need is for AI to work more efficiently with less data. Right now we practically have to feed the whole internet to something for it to be somewhat intelligent. We’re running out of usable data.

These are the reasons it’s plateauing; that, and infrastructure takes years to scale.

Me: Very interesting take. I really wonder what line of thinking it will take to get there (if at all). However, for me personally, I’m on the same page as you. These marginal improvements are little pieces of chocolate to munch on. Overall, I don’t know how they’ll shift everything, but the here-and-now applications are very nice. What do you think the next step is? I think the real “progress” will come with smart implementation and integration of existing use cases, similar to the introduction of Excel. I can’t really see an argument for groundbreaking just yet in my mind.

Honestly, the next step is the bubble collapsing, like the dot-com bubble, then a pessimistic outlook on AI; then, as more uses are found and the technology slowly improves, just like the internet, we will slowly approach the singularity without anyone noticing.

I think we are repeating history. There were so many dumb websites during the dot-com bubble that got billions, just like right now. We are throwing AI at every single problem, just like we were making websites to show you the date and throwing billions at them. We have AI models that only make images yet are somehow valued like Fortune 500 companies.

So the actual use cases and the real giants won’t go away. I mean, OpenAI will crumble.

Where it’s actually useful, like GitHub Copilot or whatever, it will stick around, keep getting funding, and keep getting better. Just slowly.

Any company that’s tried completely replacing its main labor force has failed, except for the most simple tasks. It basically replaces outsourcing code dev to India, or repetitive tasks we’ve had people doing.

Data is what you make of it and action is the key to that happening.

This post is licensed under CC BY 4.0 by the author.