What is SageMaker?
Amazon SageMaker is a cloud machine-learning platform that Amazon launched in November 2017. It enables developers to create, train, and deploy machine-learning (ML) models in the cloud, and also to deploy ML models on embedded systems and edge devices. SageMaker includes modules that can be used together or independently to build, train, and deploy your models.
What are the capabilities of SageMaker?
1. Makes building easier
Amazon SageMaker makes it easy to build ML models and get them ready for training by providing everything you need to quickly connect to your training data, and to select and optimize the best algorithm and framework for your application. Amazon SageMaker includes hosted Jupyter notebooks that make it easy to explore and visualize your training data stored in Amazon S3. You can connect directly to data in S3, or use AWS Glue to move data from Amazon RDS, Amazon DynamoDB, and Amazon Redshift into S3 for analysis in your notebook.
To help you select your algorithm, Amazon SageMaker includes the 10 most common machine learning algorithms, pre-installed and optimized. According to AWS, they deliver up to 10 times the performance you’ll find running these algorithms anywhere else. Amazon SageMaker also comes pre-configured to run TensorFlow and Apache MXNet, two of the most popular open-source frameworks. You also have the option of using your own framework.
2. Single-click training
You can begin training your model with a single click in the Amazon SageMaker console. Amazon SageMaker manages all of the underlying infrastructure for you and can scale to train models at petabyte scale. To make the training process even faster and easier, Amazon SageMaker can automatically tune your model's hyperparameters to achieve the highest possible accuracy.
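To give a feel for what automatic tuning means, here is a minimal sketch of a hyperparameter search. It uses plain random search as a simple stand-in and does not reproduce SageMaker's own tuner. The function names, the toy objective, and the hyperparameter ranges are all assumptions made for illustration.

```python
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Try n_trials random hyperparameter settings and return the best one.

    objective: callable that takes a dict of hyperparameters and returns a
               score where higher is better (for example validation accuracy).
    space:     dict mapping each hyperparameter name to a (low, high) range.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Sample every hyperparameter uniformly from its range.
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical stand-in for "train a model and return its validation accuracy".
def toy_objective(params):
    return 1.0 - (params["learning_rate"] - 0.1) ** 2 - (params["dropout"] - 0.3) ** 2


space = {"learning_rate": (0.001, 0.5), "dropout": (0.0, 0.8)}
best_params, best_score = random_search(toy_objective, space, n_trials=50)
print(best_params, round(best_score, 4))
```

In a real tuning job each trial would be a full training run, so the number of trials is usually the main cost to control.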
3. Easy deployment
Once your model is trained and tuned, Amazon SageMaker makes it easy to deploy in production so you can start generating predictions on new data (a process called inference). Amazon SageMaker deploys your model on an auto-scaling cluster of Amazon EC2 instances that are spread across multiple availability zones to deliver both high performance and high availability. Amazon SageMaker also includes built-in A/B testing capabilities to help you test your model and experiment with different versions to achieve the best results.
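As a rough illustration of the A/B testing described above, the sketch below builds an endpoint configuration with two production variants and computes the share of traffic each would receive (a variant's weight divided by the sum of all weights). The field names follow the SageMaker `create_endpoint_config` request. The endpoint, variant, and model names are hypothetical, and nothing here calls AWS.

```python
def traffic_shares(production_variants):
    """Return each variant's traffic share: its weight over the sum of all weights."""
    total = sum(v["InitialVariantWeight"] for v in production_variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in production_variants}


# Hypothetical names; both models are assumed to exist in SageMaker already.
endpoint_config = {
    "EndpointConfigName": "churn-ab-test",
    "ProductionVariants": [
        {
            "VariantName": "model-a",
            "ModelName": "churn-model-v1",
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 9.0,  # current model keeps most of the traffic
        },
        {
            "VariantName": "model-b",
            "ModelName": "churn-model-v2",
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1.0,  # candidate model gets a small slice
        },
    ],
}

shares = traffic_shares(endpoint_config["ProductionVariants"])
print(shares)

# With AWS credentials configured, the configuration could be submitted with:
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(**endpoint_config)
```

Starting the candidate at a small weight limits the impact of a bad model while still collecting enough traffic to compare the two versions.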
Amazon SageMaker takes away the heavy lifting of machine learning, so you can build, train, and deploy machine learning models quickly and easily.
Is SageMaker just Jupyter?
No. Jupyter notebooks are the entry point, but SageMaker also manages the training and deployment of your models. At the most basic level, SageMaker provides Jupyter notebooks that you can use for building, training and deploying ML models. Many data scientists use these notebooks for exploratory data analysis and for the model-building stage. You might start with something like Pandas to explore the dataset: How many rows have missing values? What does the distribution of the data look like? Is the data imbalanced? You can build many different models, ranging from logistic regression or a decision tree in Scikit Learn to deep learning models in Keras, and get baseline performance very quickly. When you move to SageMaker, the notebook interface remains the same; there is no difference.
Can you use SageMaker for free?
Amazon SageMaker is free to try: as part of the AWS Free Tier, you can get started with it at no cost. Usage beyond the Free Tier limits is billed.
What is the difference between SageMaker and EC2?
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers. Amazon EC2’s simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon’s proven computing environment.
According to AWS, Amazon EC2 offers the broadest and deepest compute platform, with a choice of processor, storage, networking, operating system, and purchase model. AWS also describes EC2 as having the fastest processors in the cloud, as the only cloud with 400 Gbps Ethernet networking, and as offering the most powerful GPU instances for machine learning training and graphics workloads, as well as the lowest cost-per-inference instances. AWS further states that more SAP, HPC, machine learning, and Windows workloads run on AWS than on any other cloud.
SageMaker, by contrast, is a managed service that makes building and deploying machine learning applications easier, and it runs on top of EC2. An Amazon SageMaker notebook instance is a fully managed ML compute instance, running on Amazon EC2, that provides a Jupyter notebook app. Amazon SageMaker Jupyter notebooks are used to perform advanced data exploration, create training jobs, deploy models to Amazon SageMaker hosting, and test or validate your models.
The notebook instance has a variety of networking configurations available to it.
What are the benefits of SageMaker?
1. Model Building
At the most basic level, SageMaker provides Jupyter notebooks. You can use these notebooks for building, training and deploying ML models. Many data scientists use these notebooks for exploratory data analysis and for the model-building stage.
You might start with something like Pandas to explore the dataset: How many rows have missing values? What does the distribution of the data look like? Is the data imbalanced? You can build many different models, ranging from logistic regression or a decision tree in Scikit Learn to deep learning models in Keras, and get baseline performance very quickly.
When you move to SageMaker, the notebook interface remains the same; there is no difference.
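The exploratory questions above (missing rows, distribution, class imbalance) can be answered with a few Pandas calls. The following is a minimal sketch on a tiny made-up dataset. The column names and values are assumptions for illustration, and in practice the DataFrame would be loaded from your own data, for example a file in S3.

```python
import pandas as pd

# Tiny made-up dataset standing in for real training data.
df = pd.DataFrame({
    "age": [34, 51, None, 29, 42, 38, None, 45],
    "monthly_spend": [20.5, 64.0, 18.2, None, 33.1, 27.9, 80.4, 41.0],
    "churned": [0, 0, 0, 1, 0, 0, 1, 0],  # label column
})

# How many values are missing in each column, and how many rows are affected?
missing_per_column = df.isna().sum()
rows_with_missing = int(df.isna().any(axis=1).sum())

# What does the distribution of the numeric columns look like?
summary = df.describe()

# Is the label imbalanced? Show the fraction of rows in each class.
class_balance = df["churned"].value_counts(normalize=True)

print(missing_per_column)
print("rows with at least one missing value:", rows_with_missing)
print(summary)
print(class_balance)
```

The same code runs unchanged in a local Jupyter notebook and in a SageMaker notebook instance, which is the point of the paragraph above.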
2. Model Training
You can use the same notebooks to train the model, store the model artifacts and files in S3, and then move on to the next step, model deployment. But what if you are building a model that takes several hours to train, say, a language translation model based on complex LSTM networks?
In this case, instead of training on the notebook instance itself, which might be a small instance, you can launch the training job on a separate GPU instance from within the SageMaker notebook. This saves cost, because most notebook work is building, checking and exploring the model, so the notebook can stay on a small instance. The training itself runs on a different machine from the one running the notebook.
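One way to picture this split is the request a notebook would send to start a training job on a separate GPU instance. The sketch below only builds that request as a Python dictionary, using the field names of the SageMaker `create_training_job` API. The job name, image URI, role ARN, and S3 paths are hypothetical placeholders, and the helper function is an illustration rather than part of any SDK.

```python
def build_training_job_request(job_name, image_uri, role_arn, train_s3_uri, output_s3_uri,
                               instance_type="ml.p3.2xlarge", instance_count=1,
                               volume_gb=50, max_runtime_s=6 * 3600):
    """Build a create_training_job request that trains on a separate (GPU) instance."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,   # container holding the training code
            "TrainingInputMode": "File",  # copy the data to the instance before training
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},  # model artifacts land here
        "ResourceConfig": {
            "InstanceType": instance_type,  # the GPU machine, not the notebook instance
            "InstanceCount": instance_count,
            "VolumeSizeInGB": volume_gb,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_s},
    }


# All values below are hypothetical placeholders.
request = build_training_job_request(
    job_name="translation-lstm-001",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest",
    role_arn="arn:aws:iam::123456789012:role/MySageMakerRole",
    train_s3_uri="s3://my-bucket/translation/train/",
    output_s3_uri="s3://my-bucket/translation/output/",
)
print(request["ResourceConfig"])

# With AWS credentials configured, the job could be started from the notebook with:
#   import boto3
#   boto3.client("sagemaker").create_training_job(**request)
```

The GPU instance is billed only while the job runs, and the stopping condition caps how long that can be.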