Amazon Web Services (AWS)
Cloud computing is a model for delivering technology services over the Internet. Instead of having to purchase and maintain in-house servers and hardware, businesses and/or users can access computing resources, such as servers, storage, databases, networks and software, through cloud service providers.
In essence, cloud computing allows organizations and individuals to use computing resources flexibly and on demand, paying only for what they actually use. This provides several advantages, such as:
There are three main models of cloud services:
Characteristic | IaaS (Infrastructure as a Service) | PaaS (Platform as a Service) | SaaS (Software as a Service) |
---|---|---|---|
Level of Abstraction | Low | Medium | High |
Management Responsibility | User (Operating Systems, Networks) | Vendor (Platform, Middleware) | Vendor (Application) |
Flexibility | High | Moderate | Low |
Scalability | High | Moderate | Limited |
Application Development | User Dependent | Platform Based | Not Required, Use Only |
Examples | Virtual Machines (AWS, Azure) | Google App Engine, Heroku | Salesforce, Google Workspace |
In Machine Learning and, more broadly, Artificial Intelligence, cloud computing is now used in all its forms: from third-party tools that let you develop models in fully integrated cloud development environments, to local development followed by deployment in the cloud (the latter being the most widely used).
Although there is a vast and varied catalog of services for working in the field of machine learning, some of the most prominent and well-known are:
Cloud data warehouses are systems designed to store large amounts of information in an efficient and scalable manner. With the recent increase in the size of data sets and the computing power needed to run machine learning models, leveraging cloud resources is a necessity for data science.
In data management, different technologies are available depending on how the data is stored and secured and what its intended use is.
A data lake is a repository that stores large volumes of data in its original, unprocessed format. This includes structured, semi-structured and unstructured data. The information is stored in its raw form, providing flexibility to analyze it in different contexts and extract valuable information.
This technology is especially useful for Big Data analysis and data exploration. Examples of technologies used in Data Lakes are Hadoop and cloud storage systems such as Amazon S3.
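As a rough illustration of the "store raw, interpret later" idea, the sketch below uses a temporary local directory as a stand-in for a data lake (in practice this would be an object store such as Amazon S3). The helper names `write_raw` and `read_purchase_totals`, the file names and the event fields are all hypothetical, chosen only for this example.

```python
import json
import tempfile
from pathlib import Path


def write_raw(lake_dir: Path, name: str, payload: str) -> Path:
    """Store a payload exactly as received, with no parsing or validation."""
    path = lake_dir / name
    path.write_text(payload, encoding="utf-8")
    return path


def read_purchase_totals(lake_dir: Path) -> dict[str, float]:
    """Apply a schema at read time: keep only JSON events that are purchases."""
    totals: dict[str, float] = {}
    for path in sorted(lake_dir.glob("*.json")):
        event = json.loads(path.read_text(encoding="utf-8"))
        if event.get("type") == "purchase":
            user = event["user"]
            # Amounts may arrive as numbers or strings; normalize when reading.
            totals[user] = totals.get(user, 0.0) + float(event["amount"])
    return totals


# A temporary directory plays the role of the lake (an assumption for this sketch).
with tempfile.TemporaryDirectory() as tmp:
    lake = Path(tmp)
    write_raw(lake, "event_001.json", '{"type": "purchase", "user": "ana", "amount": 20.5}')
    write_raw(lake, "event_002.json", '{"type": "click", "user": "ana", "page": "/home"}')
    write_raw(lake, "event_003.json", '{"type": "purchase", "user": "ana", "amount": "4.5"}')
    # Unstructured data sits next to semi-structured data without any conversion.
    write_raw(lake, "server.log", "2024-01-01 12:00:00 INFO service started")

    stored_files = sorted(p.name for p in lake.iterdir())
    purchase_totals = read_purchase_totals(lake)

print(stored_files)
print(purchase_totals)
```

Nothing is rejected or reshaped when it is written; the structure is imposed only by the function that reads the data, so a different analysis could read the same files with a different interpretation.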
A Data Warehouse is a centralized system that collects, organizes and stores data from different sources within an enterprise in a structured format optimized for analytical queries. The data in a Data Warehouse is usually historical and is designed to support decision making based on reporting and analysis. Data warehouses often use dimensional models and fact tables to enable complex queries. Examples of Data Warehouses include Amazon Redshift, Google BigQuery and Microsoft Azure Synapse Analytics.
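To make the idea of dimensional models and fact tables more concrete, here is a minimal sketch of a star schema using Python's built-in `sqlite3` module with an in-memory database. SQLite is only a stand-in for a real warehouse engine, and the table names (`dim_product`, `dim_date`, `fact_sales`) and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Dimension tables describe the "who/what/when"; the fact table stores measures.
conn.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    category   TEXT NOT NULL
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER NOT NULL,
    month   INTEGER NOT NULL
);
CREATE TABLE fact_sales (
    product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
    date_id    INTEGER NOT NULL REFERENCES dim_date(date_id),
    units      INTEGER NOT NULL,
    revenue    REAL NOT NULL
);
""")

conn.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "books"), (2, "games")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [(10, 2023, 12), (11, 2024, 1)])
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 10, 3, 30.0), (1, 11, 2, 20.0), (2, 11, 1, 50.0)],
)

# A typical analytical query: join the facts to their dimensions and aggregate.
revenue_by_category_2024 = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    WHERE d.year = 2024
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(revenue_by_category_2024)
```

Because the data is structured before it is loaded, reporting questions such as "revenue per category in a given year" reduce to a join and an aggregation.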
A data mart is a smaller version of a Data Warehouse. It is designed to address the specific needs of a department or user group within an organization. Data Marts contain a portion of the Data Warehouse data and are optimized for a particular business area. They are useful for enabling users to access and analyze relevant data in a more efficient and targeted manner. Data Marts can be independent or extracted from the main Data Warehouse.
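One possible way to picture a data mart extracted from a warehouse is as a table derived from a larger one and restricted to a single business area. The sketch below again uses an in-memory SQLite database as a stand-in; the `warehouse_sales` and `mart_marketing` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A simplified "warehouse" table covering every department.
conn.execute(
    "CREATE TABLE warehouse_sales (region TEXT, department TEXT, month TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?)",
    [
        ("EU", "marketing", "2024-01", 100.0),
        ("EU", "marketing", "2024-02", 150.0),
        ("US", "marketing", "2024-01", 200.0),
        ("EU", "finance", "2024-01", 999.0),
    ],
)

# The data mart keeps only the rows and columns the marketing team needs.
conn.execute("""
    CREATE TABLE mart_marketing AS
    SELECT region, month, revenue
    FROM warehouse_sales
    WHERE department = 'marketing'
""")

mart_rows = conn.execute("SELECT COUNT(*) FROM mart_marketing").fetchone()[0]
eu_revenue = conn.execute(
    "SELECT SUM(revenue) FROM mart_marketing WHERE region = 'EU'"
).fetchone()[0]

print(mart_rows, eu_revenue)
```

Queries against the mart never touch the finance rows, which is the "portion of the Data Warehouse data" described above. An independent data mart would instead be loaded directly from source systems rather than from the warehouse.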
The main difference between a data lake and a data warehouse lies in the format in which the data is processed and stored. A data warehouse holds structured, preprocessed data, whereas a data lake keeps data in its raw form, whatever its structure. Which technology to implement depends on the type of data we are working with and how frequently it is updated. A data warehouse is an analytical environment: it is optimized for reporting and analytical queries over historical data, not for frequent updates.