Computer vision
A colleague at work has been experimenting with feeding streamed camera and screen-sharing video into Gemini 2 Flash, and the results are really good. The model can accurately describe what it sees in near real-time and can make an overall assessment of what activity was being carried out during the screen share.
Computer vision is one of the more mature areas of machine learning, but the speed, and the ability to describe a higher-level set of actions rather than just matching individual items or images, were brand new to me. It even works in near real-time using the standard browser camera API.
Lessons learned
Although I sort of knew this already, it managed to catch me out again: Alembic, the Python migration manager for the SQLAlchemy ORM, doesn't really have a dry-run mode. The history command doesn't use the env.py file; instead it works (I think) by simply reading through the migration files.
People have suggested that switching to the SQL generation ("offline") mode is an alternative, but I was working in a CI pipeline and would probably also have needed to come up with a minor migration file to get any useful feedback.
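For reference, the SQL generation mode mentioned above is Alembic's "offline mode", invoked with the --sql flag. A minimal sketch, assuming an existing project with an alembic.ini and a versions/ directory (the revision id shown is hypothetical):

```shell
# Offline / SQL-generation mode: Alembic emits the SQL it would run
# instead of executing it against a database, which makes it the
# closest thing to a dry run.
alembic upgrade head --sql > migration.sql

# In offline mode you can also generate SQL for a revision range
# rather than the full history ("abc123" is a placeholder id):
alembic upgrade abc123:head --sql
```

Note that this still only shows what the migration scripts say, not what would actually happen against the live schema, which is roughly the limitation described above.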