William Falcon, CEO and Founder of Grid, tells us about building the next generation of AI tools and frameworks.
Tell us about you, your career, and how you founded Grid.
William Falcon: I built PyTorch Lightning while completing a Ph.D. in deep learning at NYU. The goal was simple: I wanted to try many ideas as quickly as possible without giving up flexibility (which is extremely important to me as a researcher) and without dealing with boilerplate or the challenges of scaling models. At the time, I was at Facebook AI Research, where adoption started; it then spread to the rest of Facebook and to other companies. Today, PyTorch Lightning is a global, community-driven open-source framework used across the top AI labs and companies in the world.
While we were building this framework, one of the key pain points that industry and academic labs kept raising was scaling up model development on the cloud across hundreds of machines with multiple teams working together. This is what Grid was built to provide: the ability to scale up serious workloads on the cloud in a highly dynamic, collaborative environment while satisfying enterprise requirements such as security and SOC 2 compliance. We built it so that the MLOps becomes invisible, allowing people to focus on solving their business and research problems instead of learning MLOps or Kubernetes and dealing with engineering challenges.
How does Grid innovate?
William Falcon: We make the tooling disappear. It’s what the iPhone did for phones: you don’t think about how the network, Wi-Fi, or radio signals work. You just turn it on and use it. Today, AI is stuck in the Stone Age. Every person involved in deploying these tools has to be an expert in math, the cloud, engineering, and a half dozen other fields, when all they want to do is use AI to solve a meaningful problem as quickly and efficiently as possible.
How did the coronavirus pandemic affect your business?
William Falcon: Our DNA is open source. This means that the notions of distributed community and remote-first are embedded in the fabric of who we are as a company. The coronavirus pandemic just forced us to get better at teaching engineers who aren’t familiar with a distributed, community-first mentality how to work and think this way. Now, we have employees in over 15 countries across all parts of the company.
What are the current trends in the Machine Learning space, and how is the industry changing?
William Falcon: As the demand for AI/ML in business settings continues to increase dramatically, and as the number of data scientists, researchers, and engineers entering the industry continues to grow, there’s no shortage of innovation aimed at making machine learning easier, faster, and more scalable. We’ve seen an uptick in no-code and low-code offerings, of which the Lightning framework is a great example. We also see the need for Machine Learning Operations (MLOps) to create an environment where data scientists and operations teams can collaborate as effectively as possible.
One of the changes in the industry to which we are most committed is ensuring that access to these technologies is available to as wide a range of users as possible. For us, this means nurturing and maintaining a robust community of experts who continuously build and develop our open-source framework because it enables them to overcome challenges that would otherwise prove insurmountable. Leveraging the strength of our open-source community is top of mind as we think about building AI products for the future.
Who is the target audience of Grid? Do you have any interesting partnerships?
William Falcon: Researchers, prototypers (i.e., data scientists, researchers at medium/large companies), and ML engineers.
Lightning has partnered with Nvidia on NeMo and with Graphcore on the Lightning integration for Graphcore IPUs. We’ve also recently acquired Tensorwerk, and by working together, we plan to empower data scientists and researchers to train even more advanced models faster and more cheaply than ever before.
Tell us more about your platform/technology.
William Falcon: We began by building PyTorch Lightning, an open-source research framework for PyTorch that enables users to scale their models without managing changes to their code. As our community grew and we learned more about its needs, we built Grid, our in-house platform for training models on the cloud from your laptop. Across our offerings, we are focused on enabling machine learning experts to train models at scale easily, facilitating collaboration, and reducing the operational burden of deploying AI tools.
How do you see the future of Grid? What are the next steps for the company?
William Falcon: We’re currently hard at work building the next generation of AI tools and frameworks. Our core vision remains consistent with the goals we had in mind when building PyTorch Lightning: democratizing access to AI tools and products, reducing the overhead of deploying ML technologies, and revolutionizing how the world experiences software.