Crossroad Development

two roads crossed to make an X

Database Diversity

2025-05-04

Looking back, I think the database class offered a lot of information and exposure to best practices that I probably would not have encountered in solo research, particularly organizational practices like Entity Relationship Diagrams (ERDs) and how to structure and present a formal project proposal.

One of the more interesting assignments was an essay outlining 3 different types of databases. I may have missed the 3rd intended type of database, but I landed on a type that is very relevant in the current AI tech climate: vector databases. Rather than rehash what I considered in the essay, I will just embed it here and hit a few high points of why this could matter in a practical setting.

Considering how much has changed since 2023 in terms of AI and database technology, there are new considerations. I would imagine that integrating and syncing traditional databases with a vector database is much more streamlined now, with the wider use of agents, plugins, and reasoning models, and token limits are far less restrictive than they were. I mentioned the database struggling past a million-vector limit, but I'm not sure if I meant to say tokens there, since pgvector can only index vectors up to 2,000 dimensions, while Qdrant supports something on the order of 64k dimensions, reportedly with the ability to set that even higher. Either way, Qdrant is still more performant than a general-purpose database with a vector plugin. Redis, however, is no longer the default in-memory key-value store thanks to its open source license change. I'm sure Valkey, the fork Amazon backs, has a compatible vector search option, and it might seem intuitive that an in-memory database would be great for letting a model access the data as context. However, dedicated vector databases also allow for an in-memory mode.
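To make the "in-memory vector database" idea concrete, here is a minimal sketch of the core operation every one of these systems performs: nearest-neighbor search over embeddings by cosine similarity. This is a toy brute-force version with made-up 3-dimensional vectors (real embeddings run hundreds to thousands of dimensions, and real databases like Qdrant replace the linear scan with approximate indexes); none of this is code from any particular product.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query, collection, top_k=1):
    """Brute-force nearest-neighbor search over an in-memory collection.
    Dedicated vector databases swap this linear scan for ANN indexes
    (HNSW, IVF, etc.) so it scales past a few thousand vectors."""
    scored = sorted(collection.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

# Toy embeddings keyed by document id
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.1],
    "stocks": [0.0, 0.1, 0.9],
}

print(nearest([0.85, 0.15, 0.05], docs, top_k=1))  # → ['cats']
```

The dimension limits mentioned above are limits on the length of each of these vectors; the million-vector scale problem is about how many entries `docs` holds, which is exactly where the brute-force scan breaks down and indexing starts to matter.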

I have a plan for when I will explore this world first-hand, because jumping into the totality of the AI world is daunting. I also would rather not be pigeonholed into a specific model or way of doing things without the background knowledge that supports a solid foundation in any system. For that reason, I have been reading "Neural Networks from Scratch in Python" by Harrison Kinsley & Daniel Kukieła. After that, I want to try running an uncensored model like Dolphin or the newer DeepSeek one. Finding an application to build with this tech could also be a challenge. It shouldn't be as hard as finding utility for crypto that isn't MLM schemes or gambling, but it still means finding a suitable dataset, a user interface that makes sense or integrates with other AI tools, and a justification for burning so much more energy than traditional applications.
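The "from scratch" framing in that book starts from first principles: a dense layer's forward pass is nothing more than weighted sums plus biases. This is my own toy illustration of that idea with made-up numbers, not code copied from the book:

```python
# One dense layer, forward pass only: each neuron's output is a
# weighted sum of the inputs plus that neuron's bias.
inputs = [1.0, 2.0, 3.0]

weights = [[0.2, 0.8, -0.5],   # neuron 1
           [0.5, -0.9, 0.1],   # neuron 2
           [-0.3, 0.4, 0.7]]   # neuron 3
biases = [2.0, 3.0, 0.5]

layer_output = [
    sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
    for neuron_weights, bias in zip(weights, biases)
]

print(layer_output)  # roughly [2.3, 2.0, 3.1]
```

Everything past this point in the book is, loosely speaking, stacking these layers, squashing the outputs through activation functions, and nudging the weights with gradients, which is exactly the foundation I want before picking a model or framework.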

This essay came rather early in the class, so there are a couple of other documents I would like to blog about as well, but I think I should keep rotating through the full breadth of classes and how they relate to projects I've worked on since then.
