I was intrigued by the Cityscapes segmentation model, which managed to segment the different parts of an image fairly accurately. The shapes and colors that made up the segmentation mask at times really reminded me of the underlying things they were outlining. This led me to wonder how much of the image could be recreated using just these basic shapes and labels as the underlying foundation.

I thought about how best to achieve this and ended up using some resources from a previous class, courtesy of Dan’O, that did inpainting through Replicate.

I then tried using the segmentation mask as the inpainting mask over the original cityscape image. To my surprise it actually worked quite well, producing a very similar image with some slight changes in the positioning of vehicles and a few blurry street signs.
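
In code, that first pass looked roughly like this. The model slug, prompt, and input field names below are placeholders standing in for whatever the class materials actually used, not the exact values:

```python
# Sketch: reuse the segmentation mask as the inpainting mask over the original
# cityscape image via Replicate. Model slug and field names are placeholders.
import replicate

with open("cityscape.png", "rb") as image, open("segmentation_mask.png", "rb") as mask:
    output = replicate.run(
        "stability-ai/stable-diffusion-inpainting",  # placeholder inpainting model
        input={
            "image": image,   # the original cityscape photo
            "mask": mask,     # the segmentation mask, reused as the inpaint mask
            "prompt": "a city street with cars, buildings, and street signs",
        },
    )

print(output)  # usually a URL or file object pointing at the inpainted result
```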

Screen Recording 2025-02-09 at 3.36.04 PM.mov

At this point I had been expecting a more distorted and warped image, and I wanted to see the segmented image devolve into something unrecognizable. So, with the help of claude.ai, I altered the code to add a recursive feature: the inpainted image becomes the new base image and is segmented again, in an endless loop.
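
The loop itself is simple. Here is a rough sketch, where `segment()` and `inpaint()` are hypothetical wrappers around the segmentation model and the Replicate inpainting call above:

```python
# Sketch of the recursive loop: the inpainted output becomes the next base
# image and is segmented again, endlessly. segment() and inpaint() are
# hypothetical wrappers around the segmentation model and the Replicate
# inpainting call shown earlier.
import itertools

base_image = "cityscape.png"

for i in itertools.count():
    mask = segment(base_image)            # segment the current base image
    result = inpaint(base_image, mask)    # inpaint the base using its own segmentation mask
    out_path = f"iteration_{i:04d}.png"
    result.save(out_path)                 # assumes the wrapper returns a PIL-style image
    base_image = out_path                 # feed the output back in as the new base
```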

Screen Recording 2025-02-09 at 3.52.10 PM.mov

To my absolute surprise, this still seemed to work. It was only once I began changing the model being used and setting all the base parameters to “fastest” (and thus less accurate) that I started to get the effect I was looking for initially.

Screen Recording 2025-02-09 at 3.51.13 PM.mov


I thought more about how some of these segmentation models were outputting strange patterns when i set them to their fastest parameters. Some of them were even visually satisfying and it made me think about using the segmentation models with no image or little image input, so allowing them to go wild segmenting things that were just not there.

Screen Recording 2025-02-11 at 6.06.29 PM.mov

Here I was supplying it with a single real video frame every 5 iterations of segmentation. You can see that it goes a little wild, with some cool patterns in between the moments when real video is provided. I decided to use the “ade20k” model for this since it always gave the most categories and the wildest segmentation patterns.
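
The injection schedule looked roughly like this sketch, where `segment_ade20k()` is a hypothetical wrapper around whichever ADE20K-trained segmentation model is being used, and the frames are read with OpenCV:

```python
# Sketch: feed the loop one real video frame every 5 iterations; otherwise let
# the ADE20K segmentation run on its own previous output. segment_ade20k() is
# a hypothetical wrapper around an ADE20K-trained segmentation model.
import cv2
from PIL import Image

cap = cv2.VideoCapture("street.mp4")  # placeholder input video
current = None

for i in range(500):
    if i % 5 == 0:  # every 5th iteration, pull in a real frame
        ok, frame = cap.read()
        if ok:
            current = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    # on the other iterations, segment the previous segmentation output itself
    current = segment_ade20k(current)
    current.save(f"frame_{i:04d}.png")
```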

Screen Recording 2025-02-11 at 6.11.54 PM.mov

Here I decided to add pixelation to the pattern and stopped adding real video frames (only supplying one at the very beginning). I am not sure whether I like this version better, but it is certainly something, and very abstract at that.
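
The pixelation step can be as simple as downscaling and then upscaling each frame with nearest-neighbor resampling, so every block collapses into one flat color. A small sketch with Pillow, where the block size of 16 is just an example value:

```python
# Sketch: pixelate a frame by shrinking it and scaling it back up with
# nearest-neighbor resampling, so each block becomes a single flat color.
from PIL import Image

def pixelate(img: Image.Image, block: int = 16) -> Image.Image:
    small = img.resize((max(1, img.width // block), max(1, img.height // block)), Image.NEAREST)
    return small.resize((img.width, img.height), Image.NEAREST)

frame = Image.open("frame_0000.png")
pixelate(frame).save("frame_0000_pixelated.png")
```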