Page 96 - Code Craft Computer-8

            For example, a student's test scores or the search keywords a user types into Google. Data is
            very important today for all applications. It is considered digital capital, or digital gold.

            Data Source
            Data can be obtained through observations, measurements, studies, or analysis. The common data
            sources are:

            •    Internet search engines and social media accounts
            •    Business systems
            •    Government systems
            •    Internet of Things (IoT) devices, that is, internet-connected smart devices such as smart
                 air conditioners and refrigerators
            Data as a Domain of AI: Data is used to train artificial intelligence models. Data is the key to
            artificial intelligence. Artificial intelligence needs data to build its intelligence: initially,
            subsequently, and continuously. Whenever you want an AI algorithm to predict an output, you
            need to train it first using data.
            In the above example of Amazon shopping, the AI algorithm is first trained on large volumes
            of data from users who shop on the website: the products they search for and the products they
            buy. Using each user's age, search history, and purchasing history, the AI is trained to recognise
            patterns in the products users are likely to buy. The algorithm then uses this training to predict
            what another user is likely to buy and recommends those products to that user.
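The idea of learning patterns from past purchases can be illustrated with a very small sketch. This is not how Amazon's system actually works; it is a simplified example that only counts which products were bought together. The purchase data and the function names `train` and `recommend` are made up for illustration.

```python
from collections import Counter

# Hypothetical purchase history: each inner list is one user's purchases.
purchase_history = [
    ["phone", "phone case", "charger"],
    ["phone", "phone case"],
    ["laptop", "mouse"],
    ["phone", "charger"],
    ["laptop", "mouse", "laptop bag"],
]


def train(baskets):
    """Learn a pattern: count how often each product is bought with another."""
    bought_together = {}
    for basket in baskets:
        for product in basket:
            others = bought_together.setdefault(product, Counter())
            others.update(p for p in basket if p != product)
    return bought_together


def recommend(model, product, top_n=2):
    """Suggest the products most often bought together with `product`."""
    counts = model.get(product, Counter())
    return [name for name, _ in counts.most_common(top_n)]


model = train(purchase_history)
print(recommend(model, "phone"))
print(recommend(model, "laptop", top_n=1))
```

The more purchase data the `train` step sees, the more reliable the counts become, which is why large volumes of data matter for training.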
            Datasets: Related data is grouped into datasets. A dataset is a set of numbers or values that pertain
            to a specific topic. For example, test scores of students in a class or what a user searched for on the
            internet on different days.

            Data Types in AI: With reference to AI models, data is of two types: training data and testing
            data.

            1.  Training Data
                 Training data is used for training the model (typically about 70% of the data). It is the
                 initial dataset that programmers use to teach a machine or machine learning application to
                 recognise patterns and trends. For better efficiency of an AI project, the training data should
                 be relevant and authentic. For example, if the online shopping application is not trained on
                 correct data, the application may recommend products that the user has no interest in buying,
                 and the user may move on to another shopping website.

            2.  Testing Data
                 Testing data is used for evaluating the model (typically about 30% of the data). It is used
                 while testing or validating the AI model's accuracy. For example, if the online shopping
                 application is trained on biased data, the application may recommend only low-cost products
                 to a certain population or gender. This error can be identified during the testing process.
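The 70/30 division described above can be sketched in a few lines of Python. This is only a minimal illustration: the test scores are invented sample values, and `split_dataset` is a hypothetical helper name, not part of any library.

```python
import random


def split_dataset(records, train_fraction=0.7, seed=42):
    """Shuffle the records, then split them into training and testing data."""
    shuffled = list(records)               # copy, so the original stays unchanged
    random.Random(seed).shuffle(shuffled)  # fixed seed gives the same split each run
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical dataset: test scores of ten students in a class.
scores = [56, 78, 91, 64, 83, 70, 88, 95, 49, 73]

training_data, testing_data = split_dataset(scores)
print("Training data:", training_data)
print("Testing data:", testing_data)
```

The data is shuffled before splitting so that both parts contain a fair mix of values. The testing data is kept aside and is not used while training, so it can show how well the model handles data it has not seen before.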

            Computer Vision
            As humans, we can see things, analyse them, and then take the required action on the basis of
            what we see. But can machines do the same? Can machines have eyes like humans do? If you

