Hey everyone, this past school year I was given the opportunity to participate in a program known as UCOSP. Basically, it is a for-credit course that allows students from various universities across Canada to work on open-source projects in small groups (5-7 students) with an industry mentor. I had the honor of working with three other talented students from the University of British Columbia, as well as two friendly and intelligent Mozilla employees, on the Jupyter Notebook project.
So what exactly is a Jupyter Notebook, and why should you really care?
Well, the Jupyter Notebook is an electronic document that can contain Python code as well as flexible human-readable text such as paragraphs, comments, figures, equations, etc. The code within these documents can be executed and analyzed.
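To make that a little more concrete, here is a minimal sketch of what such a document can look like on disk. It assumes the version 4 notebook JSON layout (a list of cells, each marked as Markdown or code); the cell contents are made up purely for illustration, and only Python's standard library is used.

```python
import json

# A tiny notebook: one Markdown cell for human-readable text and one code cell.
# Field names follow the nbformat 4 JSON layout; the contents are illustrative.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": "# Area of a circle\nThe area is $A = \\pi r^2$.",
        },
        {
            "cell_type": "code",
            "execution_count": None,  # not executed yet
            "metadata": {},
            "outputs": [],            # filled in once the cell is run
            "source": "import math\nradius = 2\nmath.pi * radius ** 2",
        },
    ],
}

# Serialize to the text that would be stored in a .ipynb file.
ipynb_text = json.dumps(notebook, indent=1)

# Reading it back shows the mix of prose cells and code cells.
for cell in json.loads(ipynb_text)["cells"]:
    print(cell["cell_type"], "->", cell["source"].splitlines()[0])
```

Saving `ipynb_text` to a file ending in `.ipynb` should give something the Notebook can open, with the text rendered and the code ready to run.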
One of the main challenges I overcame during this experience was learning to work in, and adapt to, a distributed working environment. My previous assignments were with a local team of developers, usually in person. People could be called or messaged, and ideas could be bounced off each other on whiteboards, over a cup of coffee or even on the same computer.
This new approach had team members working together from all over Canada (British Columbia, Ontario, Nova Scotia) as well as internationally (England). The first and most obvious obstacle was that each location was in a different time zone. When scheduling meetings or Skype calls, it was often hard to find a reasonable time, given the up-to-nine-hour time difference. Outside of these scheduled weekly meetings, we communicated via IRC or Skype, both platforms I am familiar with.
You might be asking, “Well, now I know what the notebooks are, so what was your team doing?” Our team worked on various add-ons for the Jupyter Notebook: extensions that will be released as open source and made accessible to the public.
Our second extension was an add-on for Apache Spark integration. Apache Spark is a framework for cluster computing, a system where distributed machines can be used together to perform large computations. It is very useful for data processing, something that Mozilla would definitely use to process large volumes of Firefox usage information. Whenever an Apache Spark job was running within the Jupyter Notebook, we wanted to show a progress bar.
Unfortunately, this task had just gotten off the ground as the university’s Reading Week began, and I had already made plans to be out of the country. I kept tabs on the progress over IRC during that week, but sadly did not have time to make significant contributions. Upon my return, I tried to ramp up quickly, but to little avail. Apache Spark does not like to run natively on Windows, so I installed VirtualBox with Ubuntu 15.04 to get it working. Only after much debugging and installing did I realize that all the tasks for the Apache Spark project had already been claimed or were in progress by other team members. The feeling of insignificance returned, but after I expressed my concern to the Mozilla team lead, he suggested that it might be better to get started on the next project, which would soon be fleshed out and discussed. At the end of our meeting, he mentioned that someone around the office was looking for C++ assistance, which is my area of expertise, but unfortunately that did not lead anywhere.
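To make the progress-bar idea concrete, here is a minimal sketch of the rendering side only. It is not the extension's actual code: `render_progress` is a hypothetical helper, and the task counts are made-up numbers standing in for whatever a running Spark job reports (in PySpark, I believe `SparkContext.statusTracker()` can supply per-stage task counts, but treat that as an assumption).

```python
def render_progress(completed, total, width=30):
    """Return a text progress bar such as '[#####-----] 50% (5/10 tasks)'."""
    if total <= 0:
        # Nothing has been scheduled yet, so show an empty bar.
        return "[" + "-" * width + "] 0% (no tasks yet)"
    # Clamp so a stale or odd reading never over- or under-fills the bar.
    completed = max(0, min(completed, total))
    fraction = completed / total
    filled = int(fraction * width)
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] {fraction:.0%} ({completed}/{total} tasks)"


# Made-up snapshots of (completed, total) task counts for one job over time.
for done, total in [(0, 8), (2, 8), (6, 8), (8, 8)]:
    print(render_progress(done, total, width=16))
```

In a notebook, the same numbers could just as well drive a graphical widget; plain text keeps the sketch free of extra dependencies.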
Lastly, our third and final add-on was for Amazon S3 cloud storage. The team wanted to be able to use Amazon’s S3 cloud storage platform as a notebook store so that files do not have to be stored locally. This project was mainly exploratory, since several other open-source contributors had been working on something similar. I investigated and gathered information to guide our development, such as whether there was any workable code we could fork. I noticed a similar project designed for Google Drive; perhaps it could be adapted to support S3. Unfortunately, the school term ended before any significant progress could be made. However, I intend to keep pursuing open-source projects such as this one afterwards, starting right now!
I hope to continue working with Mozilla to get something working. I definitely enjoy working with everyone, and my mentors are very intelligent, open and approachable individuals. Working in an open-source environment gives me unrestricted access to the world’s knowledge, and lets even someone like me share everything I have learned. Being able to see people happily using what I have worked on is an amazing pleasure and honor for me as a software developer. The journey into the world of open source is a never-ending road of exploration, learning and sharing. Mozilla is a wonderful company with many wonderful people. It would be a dream to be given opportunities to work with them in the future.