For the past couple of months, I have been working through the backend development program on boot.dev. The program teaches Python, Go, Git, Docker, and other useful technologies. It has been a great experience so far; I have already advanced to the Go section, which is tough but nothing a little persistence can't overcome. The gamified nature of the program makes it enjoyable, and changes are made consistently to enhance the learning experience. Check it out if you are interested.
Aside from the quizzes, one thing I like is the guided projects, where you implement a project yourself with minimal guidance, however you see fit. After completing each language section, you also have to build a project on your own. My idea for a project was a Python web scraper that runs on AWS Lambda; you can check it out here. It's a work in progress, and I'll be making improvements to it in the near future. That is what I want to talk about in this post: the projects I did while going through the program.
Web Scraper
You can check out the web scraper code here. It makes use of Python along with BeautifulSoup and Scrapy, two Python libraries used to scrape information from web pages. The first version of the program used Scrapy to scrape a particular blog (my company's). The idea was to scrape the blog posts and use them as a knowledge base for ChatGPT, followed by a Slack bot to interact with it. Not an original idea, but I thought it would be fun to have a Slack bot I could ask about what's in the company documentation to solve customer queries.
The scraping worked initially, but I hit an issue when it came to integrating with AWS Lambda. I couldn't figure out how to get the URL submitted to the form from the Lambda event handler into my Scrapy program. I also didn't know at the time about packaging the code and its dependencies with Docker and deploying that image to Lambda.
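As a sketch of the event-handler piece: when a Lambda sits behind API Gateway, a form submission arrives URL-encoded in the event's `body` field, and the handler can parse the target URL out of it before handing it to the scraper. This is a minimal illustration, not my actual handler; the `url` field name is hypothetical and depends on how the form is built.

```python
import json
from urllib.parse import parse_qs

def lambda_handler(event, context):
    # API Gateway passes a form POST body as a URL-encoded string,
    # e.g. "url=https%3A%2F%2Fexample.com".
    body = event.get("body") or ""
    fields = parse_qs(body)
    # "url" is a hypothetical form field name for illustration.
    target_url = fields.get("url", [""])[0]
    # ...here the scraper would be invoked with target_url...
    return {"statusCode": 200, "body": json.dumps({"url": target_url})}
```

With the URL in hand as a plain string, passing it on to the scraping code in the same file is straightforward.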
After some trial and error, I decided to simplify by switching to BeautifulSoup, which I found easier to work with. Looking back, I probably picked Scrapy because of all the additional features it had, thinking those would come in handy for future updates to the project. Sometimes it's better to follow the KISS (Keep It Simple, Stupid) principle. BeautifulSoup made things much easier to understand, and after retrieving the URL submitted to the form, it was straightforward to wire it into the code since everything was in the same file.
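To show what "easier to work with" looks like, here is a minimal BeautifulSoup sketch of pulling a title and paragraphs out of a post. The `<h1>`/`<p>` selectors are an assumption for illustration; every blog's markup differs, which is exactly the point made below.

```python
from bs4 import BeautifulSoup

def extract_post(html_doc):
    # Parse with the stdlib html.parser backend (no extra dependency).
    soup = BeautifulSoup(html_doc, "html.parser")
    # Selector choices are illustrative; real blogs need site-specific ones.
    title = soup.find("h1").get_text(strip=True)
    paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
    return title, paragraphs

sample = """
<article>
  <h1>Post title</h1>
  <p>First paragraph.</p>
  <p>Second paragraph.</p>
</article>
"""
```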
If there's one other thing I learnt from implementing this, it's that scraping blog posts, or web pages in general, has no universal solution, since each website may be structured differently. I guess that's why web scraping is still a highly valued skill, especially in the AI/machine learning ecosystem the world has going on. That's something I'm also looking to get into in the future.
PokeAPI CLI
The other project I worked on is a CLI application implemented in Go that communicates with PokeAPI, an API that provides Pokémon data. You can check out the project code here. This was one of the guided projects, and I can definitely say it was challenging to finish. I learnt a lot about making requests to APIs and a little about implementing a cache (the simple kind). Go is a pretty versatile language, and one thing I like is that unused imports and unused variables are compile errors (with editors stripping unused imports for you). This helps a lot to catch mistakes I would have missed in Python.
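The "simple kind" of cache mentioned above, a map of key to (timestamp, value) with entries expiring after a fixed interval, is language-agnostic. The project itself is in Go, but the idea can be sketched in a few lines of Python; the class name and TTL interface here are my own, not the project's.

```python
import time

class SimpleCache:
    """Tiny in-memory cache with per-entry expiry (the 'simple kind')."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = {}  # key -> (stored_at, value)

    def set(self, key, value):
        self.entries[key] = (time.time(), value)

    def get(self, key):
        item = self.entries.get(key)
        if item is None:
            return None
        stored_at, value = item
        if time.time() - stored_at > self.ttl:
            # Entry is stale: evict it and report a miss,
            # so the caller re-fetches from the API.
            del self.entries[key]
            return None
        return value
```

Caching PokeAPI responses this way avoids re-fetching the same resource on every command.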
The project also taught me about the json package and about creating modules and packages for a Go project.
…..TBC