Setting Up Google Colab for CNN Modeling

During my journey as a Data Science student, I created a project using Convolutional Neural Networks (CNNs). CNNs are deep learning models that are becoming increasingly popular in the world of Data Science for image classification.

There is just one major issue when you’re building them at home — the training time for these models is very long!

Enter Google Colab.

Colab is great because it lets you run your notebook on a hosted machine that is most likely better/faster/stronger than your local one. That means faster training for your CNN model.

In Colab, you can make use of GPUs! Normally your computer is running on a CPU (Central Processing Unit). The CPU is the brain of your computer, which will run your programs and, in this case, train your model. Even though you can split the work between multiple cores of your CPU, it won’t be as fast as a GPU for machine learning. A GPU (Graphics Processing Unit) is a processor made up of many smaller, specialized cores, which means that it can split the work between even more cores and therefore get your CNN training done faster. The GPU is designed for parallel processing, so it will be much faster than a CPU in this use case.

Make use of GPUs with High-Speed RAM with Colab Pro. For only $10 per month, Colab has an upgraded option that gives you access to even faster GPUs and longer runtimes (AKA your notebook won’t time-out/disconnect as often). It also gives you access to more RAM, thus you will have more memory for your data. I actually went with this option because I believe $10 is super cheap when you compare it to the time-savings you will experience (but I also know people who have had lots of success and speed just using normal Colab).

Therefore, use Colab to save time! When I compared my model’s training time on Colab to another student’s training time on their local machine, Colab trained a model in 3–4 minutes (approx. 7 seconds per epoch) vs. 6 hours or so on the local machine (approx. 5 minutes per epoch). Everyone’s computer is different, but that is a HUGE time savings even allowing for variations in your local machine’s speed.
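To put the per-epoch figures above side by side, here is a small arithmetic sketch. The 7 seconds and 5 minutes per epoch come from the comparison above; the epoch count of 30 is purely an assumption for illustration, and the two runs in the comparison may well have used different epoch counts.

```python
# Per-epoch times quoted in the comparison above.
COLAB_SECONDS_PER_EPOCH = 7
LOCAL_SECONDS_PER_EPOCH = 5 * 60  # 5 minutes

def total_minutes(seconds_per_epoch, epochs):
    """Total training time in minutes for a given number of epochs."""
    return seconds_per_epoch * epochs / 60

# How many times faster each epoch ran on Colab.
speedup = LOCAL_SECONDS_PER_EPOCH / COLAB_SECONDS_PER_EPOCH

# Assumed epoch count (hypothetical, not taken from the original project).
ASSUMED_EPOCHS = 30

print(f"Per-epoch speedup: about {speedup:.1f}x")
print(f"Colab, {ASSUMED_EPOCHS} epochs: {total_minutes(COLAB_SECONDS_PER_EPOCH, ASSUMED_EPOCHS)} minutes")
print(f"Local, {ASSUMED_EPOCHS} epochs: {total_minutes(LOCAL_SECONDS_PER_EPOCH, ASSUMED_EPOCHS)} minutes")
```

With these numbers, each epoch runs roughly 43 times faster on Colab.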

Prerequisites: A Google account

Very important: You should set your runtime before you do anything! If you forget to do this before you run anything, you will need to restart your kernel when you finally set it to the high-speed option. To do this, go to the top toolbar and go to Runtime > Change runtime type. You can then change your hardware accelerator to a GPU, and if you opted for the Colab Pro option, you can also set the runtime to High RAM.
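Once the runtime type is set, it can be worth confirming that a GPU is actually visible before you run anything heavy. The sketch below is one possible check, not an official Colab recipe: it assumes the GPU runtime ships NVIDIA's nvidia-smi command-line tool, and the helper name gpu_visible is made up for this example.

```python
import shutil
import subprocess

def gpu_visible() -> bool:
    """Return True if the nvidia-smi tool is installed and exits cleanly."""
    # Assumption: a GPU runtime exposes nvidia-smi on the PATH.
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    except OSError:
        return False
    return result.returncode == 0

print("GPU runtime active:", gpu_visible())
```

If this prints False, go back to Runtime > Change runtime type and check the hardware accelerator setting.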

Next, we will set up a way to get ahold of your data. Here’s the thing about Colab… unlike your local machine, you need to add your data to the machine you’re working on. Anything you upload to Colab will go to the content folder. However, you can also mount your Google Drive and use that to pull data and files into your notebook. There is a toolbar on the side that you can use to see what’s happening and where things are going.

After this step, mount your Google Drive to the Colab machine. When you do, you’ll be able to pull files out of your Drive and/or save your models back into it for later use. When you run the mount cell, it will ask you to go to a link, authorize Colab’s use of your files, and copy and paste the authorization code back into the notebook.

To make sure this worked, check that folder toolbar again and look for a folder called gdrive on the same level as all the other folders. If you look in this folder, you should see everything that’s in your Google Drive.
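The mount step described above can be sketched roughly as follows. It uses the drive.mount call from the google.colab package and mounts at /content/gdrive so that the folder name matches the gdrive folder mentioned above. The mount_drive wrapper and its fallback message are my own additions so the cell does not crash outside Colab.

```python
def mount_drive(mount_point="/content/gdrive"):
    """Mount Google Drive when running in Colab; return True on success."""
    try:
        # google.colab only exists inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        print("Not running in Colab; skipping Drive mount.")
        return False
    # Starts the authorization flow, then exposes Drive under mount_point.
    drive.mount(mount_point)
    return True

mounted = mount_drive()
```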

Now it’s time to get your data! Either upload a zipped file to Google Drive in a new tab or pull the data from an API in the notebook you just created.

Finally, you’re going to unzip the file, but the trick is to copy it to the Google Colab machine before you unzip. This avoids reaching all the way into Google Drive every time you access the data, which slows things down. This way, everything is unzipped and ready to go on the machine that is using the GPU, high RAM, etc., so you have the fastest setup available.
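A minimal sketch of the copy-then-unzip step, using only Python's standard library, might look like this. The function name copy_and_unzip and the example path in the comment are hypothetical; substitute wherever your zip actually lives in Drive.

```python
import shutil
import zipfile
from pathlib import Path

def copy_and_unzip(zip_on_drive, local_dir="/content"):
    """Copy a zip file onto the Colab machine, then extract it there."""
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    # Copy first, so extraction reads from local disk rather than Drive.
    local_zip = Path(shutil.copy(zip_on_drive, local_dir))
    with zipfile.ZipFile(local_zip) as archive:
        archive.extractall(local_dir)
    return local_dir

# Example call (hypothetical path, adjust to your own Drive layout):
# copy_and_unzip("/content/gdrive/MyDrive/images.zip")
```

The copied zip stays next to the extracted files, so you can delete it afterwards if disk space is tight.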

And voila! Your file should now be unzipped and in the content folder. You can double-check to make sure it’s there by navigating the side toolbar.

If you’re lucky, your data will already be split into nice folders of images for your train/test/validation split. Otherwise, you can create those folders manually, or get your data into the notebook first and then do a train/test/val split using sklearn’s train_test_split function.
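If you go the manual route, the idea is just to shuffle the file names and slice them into three groups. The sketch below uses only the standard library as a stand-in; it is not sklearn's train_test_split. The split_paths helper, the 10% fractions, and the file names are all assumptions for illustration. For a classification dataset you would probably run it once per class folder so each class keeps the same proportions.

```python
import random

def split_paths(paths, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle file paths reproducibly and slice them into train/val/test."""
    paths = sorted(paths)               # fixed starting order
    random.Random(seed).shuffle(paths)  # seeded shuffle, repeatable
    n_test = int(len(paths) * test_frac)
    n_val = int(len(paths) * val_frac)
    return {
        "test": paths[:n_test],
        "val": paths[n_test:n_test + n_val],
        "train": paths[n_test + n_val:],
    }

# Hypothetical file names standing in for real image paths.
splits = split_paths([f"img_{i:03d}.jpg" for i in range(100)])
print({name: len(items) for name, items in splits.items()})
```

From there, each group can be moved into its own train, val, or test folder.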

Now we can move on to preprocessing the images and loading them into the notebook from Colab’s local machine!

For my project, I preprocessed the images with the ImageDataGenerator and then iterated through the generators with the next function to get my data into the X, y formats needed for modeling.
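The original code for this step is not reproduced here, so the following is only a sketch of the general pattern, assuming TensorFlow's Keras ImageDataGenerator and a folder with one subfolder per class. The load_split helper, the 64x64 target size, the rescaling choice, and the example path are assumptions rather than the settings from my project.

```python
def load_split(directory, target_size=(64, 64), batch_size=32):
    """Read one batch of images and labels from a folder of class subfolders."""
    # Imported here so the function can be defined even without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Rescale pixel values from 0-255 down to 0-1 (an assumed preprocessing step).
    generator = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
        directory,
        target_size=target_size,  # every image is resized to this shape
        batch_size=batch_size,    # set to the image count to load all at once
    )
    # next() pulls a single batch: an image array X and one-hot labels y.
    X, y = next(generator)
    return X, y

# Example call (hypothetical folder, adjust to your own data):
# X_train, y_train = load_split("/content/data/train", batch_size=1000)
```

Setting batch_size to the number of images in the folder returns the whole split in one call, which is convenient as long as the data fits in RAM.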

From here you can move on to building, training, and evaluating your Convolutional Neural Network! The speed from using the GPUs will allow you to experiment with building and tuning multiple models without having to wait for hours in between each one’s training.

Of course, Colab has both pros and cons, so you should consider both before getting set up.

In conclusion, I would use Colab and Colab Pro for any major machine learning project to save time! I think it’s a great tool for Data Scientists, despite a few minor annoyances. These are just some of the things I learned about Colab during my latest project, so if you have any more tips, pros, cons, personal experiences, etc, please leave a comment below!
