Now that we have completed our first ST 558 project, it is time to reflect a bit on the journey to get here.

General Process

For the project, I started off by just exploring around the API a bit, and getting a feel for what data I could pull that would be interesting to analyze. Once I had a pretty decent idea of what I wanted to do, I went ahead and pulled the data manually for each endpoint that I wanted. Then, once I had done it all manually, I went ahead and generalized the process to the functions. I learned that it is always much easier to start off outside of a function, and then generalize the code to work in a function.

For the data I analyzed, I looked at stock prices and growth rates of Apple, Google, Microsoft, and Amazon all throughout 2021. The most interesting thing I found was that Amazon was clearly the worst performer in terms of growth, while the other 3 all looked very solid. Microsoft’s growth seemed incredibly consistent, while Apple and Google were both a bit more varied.

Difficulties

I didn’t find anything about most of the programming too difficult, except for the rendering to a GitHub document part. I checked my code so many times, and it was not giving any errors or warnings at all. However, every time I tried to render the .Rmd file to a GitHub document it just gives me errors that just did not make sense. Eventually, I realized the answer was somewhat simple but also incredibly infuriating - I was getting an error because when I tried to knit the entire document at once, I was exceeding the maximum number of requests per minute of the API.

In the future, this is definitely something I will try to prep for ahead of time (althought I’m not exactly sure how, aside from making sure the output renders periodically). I had no idea something so mundane would present such a huge issue and be so difficult to trace back to the root cause. So I think just planning ahead in general will be a thing to improve in the future.

Resources

The link to the github pages site that contains the project is here, while the actual github repo is here.


<
Previous Post
What is data science, really?
>
Next Post
Programming in R