Page 96 - Code Craft Computer-8
For example, the test scores of a student or the search keywords a user types on Google are data. Data is
very important today for all applications. It is considered the digital capital or digital gold.
Data Source
Data can be obtained through observations, measurements, studies, or analysis. The common data
sources are:
• Internet search engines and social media accounts
• Business systems
• Government systems
• Internet of Things (IoT) devices, which are internet-connected smart devices such as smart air
conditioners and refrigerators
Data as a Domain of AI: Data is used to train artificial intelligence models. Data is the key to
artificial intelligence. Artificial intelligence needs data to build its intelligence, initially,
subsequently, and continuously. Whenever you want an AI algorithm to be able to predict an
output, you need to train it first using the data.
In the above example of Amazon shopping, the AI algorithm has been initially trained with large
volumes of data from users who shop on the website, covering the products they search for and
buy. Based on a user's age, search history, and purchase history, the AI has been trained to recognise
patterns in the products they are likely to buy. The algorithm then uses this training to predict what
another user is likely to buy and recommends those products to that user.
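The train-then-predict idea above can be illustrated with a minimal sketch. It uses simple co-purchase counting, which is an assumption chosen for illustration and not how Amazon's system actually works. The purchase histories and the function names `train` and `recommend` are hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories used as training data (invented for illustration).
purchase_history = [
    ["phone", "phone case", "charger"],
    ["phone", "phone case"],
    ["phone", "earphones"],
    ["laptop", "mouse"],
    ["laptop", "mouse", "laptop bag"],
]

def train(histories):
    """Count how often each product is bought together with every other product."""
    bought_together = {}
    for basket in histories:
        for product in basket:
            counts = bought_together.setdefault(product, Counter())
            for other in basket:
                if other != product:
                    counts[other] += 1
    return bought_together

def recommend(model, product, top_n=2):
    """Return the products most often bought together with `product`."""
    counts = model.get(product, Counter())
    return [name for name, _ in counts.most_common(top_n)]

model = train(purchase_history)                  # training step
print(recommend(model, "phone", top_n=1))        # prediction for a phone buyer
print(recommend(model, "laptop"))                # prediction for a laptop buyer
```

With more training data, the counts reflect real buying patterns more closely, which is why the text says AI needs data initially, subsequently, and continuously.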
Datasets: Related data is grouped into datasets. A dataset is a set of numbers or values that pertain
to a specific topic. For example, the test scores of students in a class or what a user searched for on the
internet on different days.
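As a small illustration, the two example datasets could be stored in Python as shown below. The student names, scores, and search terms are invented for this sketch.

```python
# Hypothetical dataset: test scores of students in one class (values invented).
test_scores = {"Asha": 78, "Ben": 64, "Chen": 91, "Dia": 85}

# Hypothetical dataset: what one user searched for on different days.
search_history = {
    "Monday": ["school bags", "water bottles"],
    "Tuesday": ["cricket bats"],
}

# Because each dataset is about one topic, simple questions can be answered from it.
average_score = sum(test_scores.values()) / len(test_scores)
total_searches = sum(len(terms) for terms in search_history.values())

print(average_score)    # 79.5
print(total_searches)   # 3
```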
Data Types in AI: With reference to AI models, data is of two types: training data and testing
data.
1. Training Data
Training data is used for training the model (typically about 70% of the data). It is the initial dataset that
programmers use to teach a machine or machine learning application to recognise patterns
and trends. For better efficiency of an AI project, the training data should be relevant and
authentic. For example, if the online shopping application is not trained on correct data, the
application may recommend products that the user has no interest in buying, and the user may
move on to another shopping website.
2. Testing Data
Testing data is used for evaluating the model (typically about 30% of the data). It is used while testing or
validating the AI model's accuracy. For example, if the online shopping application is trained
on biased data, the application may recommend only low-cost products to a certain
population or gender. This error can be identified during the testing process.
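The 70/30 division into training and testing data described above can be sketched in Python as follows. The function name `split_dataset`, the sample scores, and the fixed shuffle seed are assumptions made for this illustration. Real projects often use a library routine for the split.

```python
import random

def split_dataset(records, train_fraction=0.7, seed=42):
    """Shuffle the records, then split them into training and testing parts."""
    shuffled = list(records)                   # copy, so the original stays unchanged
    random.Random(seed).shuffle(shuffled)      # shuffle so both parts are a fair mix
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical dataset of ten test scores (values invented for illustration).
scores = [35, 42, 48, 55, 61, 67, 72, 78, 84, 91]

training_data, testing_data = split_dataset(scores)
print(len(training_data), len(testing_data))   # 7 3
```

The model learns only from `training_data`. It is then checked on `testing_data`, which it has never seen, so that errors such as bias can be noticed before the model is used.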
Computer Vision
As humans, we can see things, analyse them, and then take the required action on the basis of what we
see. But can machines do the same? Can machines have the eyes that humans have? If you

