Open-Sourcing breast_segment

Fully automated Breast Segmentation on Mammographies

After having tried (and failed) to train a Convolutional Neural Network (CNN) to identify Mammographies with malignant signs, I decided to open-source most of the pre-processing tools I had developed in the process.

breast_segment is one of those. I was using Mammographies from the Digital Database for Screening Mammography, which were taken on traditional X-ray film and later digitized (ugh!). They contained lots of noise (dust, text) that had to be removed before training any sort of machine learning model on them. The seemingly obvious way was to develop some sort of automatic segmentation of the breast area.

In medicine and especially medical research, sharing your data or even source code is, well, not only uncommon but often actively discouraged. So it didn’t come as a surprise that I could not find an implementation of any of the published breast segmentation algorithms.

[Figure panels: Original Image | Computed Segmentation Mask | Computed Bounding Box | Overlay Visualization]

So I developed my own, based on Felzenszwalb’s algorithm. It works by basically just looking for the largest connected region - hardly specific to Mammographies - and creating a boolean segmentation mask. It works in around eight out of ten images (good luck segmenting the remaining two).
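
To make the "largest region" idea concrete, here is a minimal sketch. It is not the actual breast_segment code. It assumes you already have an integer label image in which every pixel carries the id of its segment (for example the output of skimage.segmentation.felzenszwalb), and it only shows the last step: picking the biggest segment and turning it into a boolean mask and a bounding box. The function name largest_region_mask and the exclude parameter (used here to skip the background segment) are made up for illustration.

```python
import numpy as np


def largest_region_mask(labels, exclude=()):
    """Return (mask, bbox) for the largest segment in a label image.

    labels  : 2-D integer array, one id per segment (assumed input).
    exclude : segment ids to ignore, e.g. the background (hypothetical knob).
    bbox is (min_row, min_col, max_row, max_col), max values exclusive.
    """
    ids, counts = np.unique(labels, return_counts=True)

    # Drop the segments we were asked to ignore.
    keep = ~np.isin(ids, list(exclude))
    if not keep.any():
        raise ValueError("no candidate regions left")
    ids, counts = ids[keep], counts[keep]

    # The segment covering the most pixels wins.
    biggest = ids[np.argmax(counts)]
    mask = labels == biggest

    # Bounding box from the rows/columns that contain mask pixels.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    bbox = tuple(int(v) for v in (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1))
    return mask, bbox


# Toy label image: 0 = background, 1 = "breast", 2 = a small "text" artifact.
labels = np.zeros((6, 8), dtype=int)
labels[1:5, 0:4] = 1
labels[0, 6:8] = 2

mask, bbox = largest_region_mask(labels, exclude=(0,))
```

On a real image the background is often the biggest segment of all, which is why the sketch lets you exclude it explicitly. How to identify the background reliably is left open here.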

Have a look at the Repo on GitHub! I also provided an IPython Notebook walk-through example.

Good luck in your research and feel free to contact me if you have any questions.