I've always wondered what happens in the 'backend' of the training process in neural networks. Training is essentially the 'meat' of the model; without efficient and effective training, the model will not be able to accurately predict, classify, or accomplish a task on newly unseen data. Neural networks have long been valued for prediction analysis, recommendation engines, modeling, and a lot more, because they can extract complex patterns and relationships between the input and output data. The numerous layers are what make this kind of machine learning 'deep'.

So what does the training process of this neurologically inspired computer model (aka a neural network) look like? It consists of two phases: 1) forward propagation and 2) back propagation. Once we have preprocessed the dataset (normalization, reshaping the data to a specific dimension, etc.), we are ready for the training process. We first forward propagate our data, meaning we feed the inputs into the neural network we built. I like to think of forward propagation as three simple steps: 1) send in a data point (or a subset of training data points), 2) each layer receives the data, processes it (by computing a dot product with its weights), and passes the result through some non-linear activation function (ReLU, Sigmoid, Leaky ReLU), 3) the output from that layer is passed to the next layer. The process continues until we have reached the final layer. Before we get into the deep math, let's define some variables. These definitions apply to every example in this blog post!
SO, what happens after we have passed an instance of our training dataset to the network?

Back Propagation

This is where back propagation comes in. Back-propagation is the algorithm used in training neural networks to improve the model's performance/accuracy. The network updates its parameters (weights and biases) after some number of samples has passed through it (in the classic gradient descent algorithm, after a full pass of the training data, i.e. one epoch). Basically, in back propagation we need to adjust the weight parameters based on our loss function (whether we are computing mean squared loss, mean absolute loss, or any other loss function), and the updated weights must head in the right direction: the one that reduces the loss.

Let's look at an example, and focus on one specific output neuron in a neural network model. We first pass in our input training data and get an output neuron value of 0.5. However, the target value for that neuron is 1.0, so the cost function computes the error, and our optimizer uses that loss to back-propagate and change the weights to minimize it. This is why the process is sometimes called back-propagation of errors: the calculations for one layer feed the updates for the layer before it, and this backwards technique avoids redundant calculations. To do this, we need to figure out how much impact a certain weight (between a node in one layer and a node in the previous layer) has on the cost function. Does the weight have very low impact (i.e. the cost function barely changes when the weight parameter changes), or does it contribute a lot towards minimizing the loss? To find out, we need to look into a very important operation: partial derivatives.

The Process
Let's do a back propagation run-through with an example. Say we had a neural network with 3 input neurons, 2 neurons in the hidden layer, and 1 output neuron. NOTE: in each neuron, the number on the left side is the input, and the number on the right side is the activated input (passed through the sigmoid activation).

Back-propagation of the output layer

The total number of weights in this network is 8: 6 weights in the first layer and 2 weights in the second layer. Let's first focus on computing the partial derivative of the cost with respect to one of the weights feeding the output neuron: w7. What is the partial derivative of C, the cost, with respect to w7?
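To make the chain rule concrete, here's a tiny numpy sketch of the same 3-2-1 network. The inputs, weights, and target here are made-up placeholders (the post's actual numbers come from the figure), and I'm assuming sigmoid activations and a squared-error cost:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 3-2-1 network: 6 weights in the first layer, 2 in the second.
# All values below are arbitrary placeholders, not the post's figures.
rng = np.random.default_rng(0)
x  = np.array([0.5, 0.1, 0.9])   # 3 inputs
W1 = rng.normal(size=(2, 3))     # weights 1-6
w2 = rng.normal(size=2)          # weights 7-8 (w2[0] plays the role of w7)
t  = 1.0                         # target value

# Forward pass
a1 = sigmoid(W1 @ x)             # hidden activations
y  = sigmoid(w2 @ a1)            # output activation
C  = (y - t) ** 2                # squared-error cost

# Chain rule: dC/dw7 = dC/dy * dy/dz2 * dz2/dw7
dC_dy   = 2 * (y - t)
dy_dz2  = y * (1 - y)            # derivative of sigmoid at the output
dz2_dw7 = a1[0]                  # the hidden activation multiplied by w7
dC_dw7  = dC_dy * dy_dz2 * dz2_dw7

# Sanity check against a numerical derivative
eps = 1e-6
w2p = w2.copy(); w2p[0] += eps
Cp = (sigmoid(w2p @ a1) - t) ** 2
assert abs(dC_dw7 - (Cp - C) / eps) < 1e-5
```

The three factors multiplied together are exactly the three links in the chain described above.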
Back-propagation of a hidden layer

Now, let's back propagate to the first layer and compute the gradient of weight 1, w1. The partial derivatives will look slightly different this time, since the cost isn't directly associated with the hidden layer outputs. With our focus on weight 1, the partial derivative of the cost w.r.t. w1 is going to depend on the partial derivative of the cost w.r.t. the output of the hidden layer neuron. This will look like:
So in order to find the partial derivative with respect to the hidden layer output, we need to include the next layer's weights' contribution towards the cost. That is, we need to include the weights that are multiplied by the activation output of this particular hidden neuron. In this case, it's only one weight: w7. However, if multiple weights contributed, we would have to add them to the equation as well.
The partial derivative of the cost w.r.t. the hidden layer activation breaks down again into three parts: 1) the cost w.r.t. the output activation, then 2) the output activation w.r.t. the output dot product, then 3) the output dot product w.r.t. the hidden layer activation. As you can see, many terms are repeated between the updates for weights 1 and 7! This basically sums up finding the gradient of the network, and it requires lots of computation and derivatives! Imagine the workload of computing the gradient vector of bigger neural networks with millions of parameters and neurons: back propagation handles this by reusing one layer's computations to compute the updates for the layer before it. It's an amazing feat!

Back propagating in action

So this is where 'back propagating the errors' comes into play. The image above is an example of back propagation in action for weight 1. The previous partial derivatives are reflected in the update value for weight 1. The white arrows show the path of each derivative computation for weight 1's update.
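That reuse of repeated terms can be seen in a numpy sketch of the same 3-2-1 network (again with made-up values, assuming sigmoid activations and a squared-error cost): the output-layer term is computed once and then reused for the first-layer weight.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Same toy 3-2-1 network; all values are arbitrary placeholders.
rng = np.random.default_rng(0)
x  = np.array([0.5, 0.1, 0.9])
W1 = rng.normal(size=(2, 3))     # first-layer weights (W1[0, 0] is w1)
w2 = rng.normal(size=2)          # second-layer weights (w2[0] is w7)
t  = 1.0

a1 = sigmoid(W1 @ x)
y  = sigmoid(w2 @ a1)
C  = (y - t) ** 2

# The output-layer term dC/dy * dy/dz2 is computed ONCE...
delta_out = 2 * (y - t) * y * (1 - y)

# ...and reused for every first-layer weight. For w1 (input 1 to
# hidden neuron 1): dC/dw1 = delta_out * w7 * da1/dz1 * x1
dC_dw1 = delta_out * w2[0] * a1[0] * (1 - a1[0]) * x[0]

# Numerical check: nudge w1 and watch the cost move
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
Cp = (sigmoid(w2 @ sigmoid(W1p @ x)) - t) ** 2
assert abs(dC_dw1 - (Cp - C) / eps) < 1e-5
```

Notice `delta_out` contains exactly the terms that were already needed for w7's update, which is the redundant computation back propagation avoids.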
This same process occurs for every weight in the network during back propagation. Back prop is one of the most important parts of training supervised neural network models! :)
When it comes to machine learning, where computers learn and recognize patterns similar to the way our brains do (which is why the ML/AI fields are so related to neuroscience!), we want to increase the accuracy and efficiency of our prediction algorithm, so that the prediction gets better and better, closer to the target value we are aiming for.

Let's compare this to a more human, real-life scenario. Suppose you are studying for an exam. You have a set study plan, and you followed the points in your plan to prepare. Unfortunately, you weren't too satisfied with the result, since your exam score was way off the target score you wanted to achieve. So, what do you do? Well, you would make some adjustments to your study plan in order to prepare for your next exam effectively and efficiently. You would adjust based on what you think you should have worked on more, i.e. covering broader topics next time, reading over applications, practicing more problems, etc. You are doing this because you learned from your past experience and are making changes to perform better and get a better result.

This is exactly what optimization in machine learning is all about. It's about making small adjustments and checking whether the algorithm's accuracy is improving. The algorithm wants to get as close to the target value as possible, and optimization techniques let us choose and adjust the parameters of our machine learning model (also known as the weights) so the algorithm performs better next time. One of the most well-known optimization methods used today is called Stochastic Gradient Descent (or SGD for short). How does this method work, and what adjustments does it make to our algorithm?

Before we add the word "Stochastic"

Before we dive into what Stochastic Gradient Descent is, let's take a look at what the term "Gradient Descent" means.
When we want to increase the accuracy of a machine learning algorithm, we look at the error difference: what is the difference between the target value and the value our model predicted? Let's say we had a model that outputs a real-valued number based on some arbitrary inputs. The predicted value (the value produced from the model parameters) is 1.45. The actual target value is 3.32. This doesn't look like a model that has 'learned' much, since the target and predicted values aren't very close. The error difference is 3.32 - 1.45 = 1.87. We want the predicted value to be close to the target value, so how do we get a good sense of whether the algorithm is performing terribly or well?

Cost Function

We can use a cost function to analyze the model's ability to understand and learn the patterns and relationships between inputs and outputs. A cost function is usually a function of the error difference, and our goal in machine learning problems is to MINIMIZE the cost function. There are many cost functions we can use to optimize models; examples are Mean Squared Error (MSE), Mean Absolute Error (MAE), and plenty of trickier, more math-heavy ones :) Now we might be thinking: why can't we just minimize the error difference itself? Why do we have to work with a cost function? Because the error difference can be negative; its value would then be 'minimized' even though a huge gap still exists between the predicted and actual values. So instead we focus on some function of the error, rather than the error itself. For instance, if we are improving a single linear neuron's output prediction (in a neural network), we want to find a set of weights that minimizes the cost function. This means finding the minima of the function: where is the lowest point of this particular function?
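As a quick numpy sketch, here are MSE and MAE on some made-up predictions, with the 1.45 vs 3.32 example above as the first pair:

```python
import numpy as np

# Hypothetical predictions vs targets; squaring (or taking the
# absolute value of) each error means negative and positive
# differences both count against the model.
y_pred = np.array([1.45, 2.10, 0.80])
y_true = np.array([3.32, 2.00, 1.00])

mse = np.mean((y_true - y_pred) ** 2)   # Mean Squared Error
mae = np.mean(np.abs(y_true - y_pred))  # Mean Absolute Error
```

Both are functions of the error difference, so minimizing them pushes every prediction towards its target, regardless of the sign of the error.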
This neuron has three inputs and 3 associated weights, one per feature. The output is the dot product of the weights and inputs, which is then passed into an activation function such as ReLU. Now, how would we train our model to adjust the weight parameters and improve the accuracy? Since we apply functions to get to the output of this neuron, we need to do just the opposite to backpropagate, i.e. go back and change the weight parameters to perform better. This means we're getting into derivatives. We know that the cost function is technically a function of the weights, since computing the difference between the target and predicted value involves the predicted value, which involves the dot product of the weights and inputs. The main goal of gradient descent is to use the gradient to iteratively decrease the cost function until we find a set of weights where the cost is minimized and the accuracy maximized.

The Gradient and the Descent

We can first set arbitrary values for our weights, for instance starting from 0. Then, from that point, we need to figure out which direction to move in order to reach the minimum value of the cost function. The gradient of the cost function tells us exactly that. The gradient, denoted with an upside-down triangle (∇), represents the slope of the function with respect to each weight: it is a vector of partial derivatives, one per weight (e.g. w1, w2, etc.). *Note: we are assuming this cost function is strictly convex, i.e. a function with no more than one minimum point.* When we update our weights, we subtract the gradient multiplied by the step size from the old set of weights: w_new = w_old - (step size) * ∇C(w_old). The w in this equation is a vector of weights, in this case the two weights w1, w2, and the gradient is likewise a vector of the partial derivatives with respect to w1 and w2.
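The update rule above can be sketched in numpy. This is a toy least-squares cost with made-up data, a made-up step size, and no activation function, just to show the descent loop:

```python
import numpy as np

# Gradient descent on a convex cost: the MSE of a linear model.
# X, y, the true weights, and the step size are illustrative values.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
w_true = np.array([2.0, -1.0])
y = X @ w_true

w = np.zeros(2)            # start from arbitrary weights (here, 0)
step = 0.1                 # the step size
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)   # vector of partial derivatives
    w = w - step * grad                     # w_new = w_old - step * gradient

# The loop walks downhill until the weights reach the minimum
assert np.allclose(w, w_true, atol=1e-3)
```

Each pass computes the full gradient from ALL 50 samples before taking a single step, which is exactly the inefficiency that motivates the "stochastic" variant below.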
This process of updating the weights happens iteratively until the gradient is at, or very close to, 0 (indicating we're at a minimum). The step size controls how fast the iterative process converges. A small step size slows down convergence, while a large step size can make the updates overshoot forever and diverge, which is not good! That is why it is so important to find the right step size.

Back to the 'Stochastic'

OK, so now that we've looked into Gradient Descent, what does the term "Stochastic" mean? Stochastic is all about randomness. What could randomness add to this algorithm, though? While the Gradient Descent algorithm is very effective, it isn't very efficient for huge amounts of data and parameters. The classic Gradient Descent algorithm updates the weights only once per epoch (a complete pass of the training data): it has to see every training sample to compute each gradient. With a large number of features and weights, this can be computationally exhausting and time consuming. Stochastic Gradient Descent, or SGD, introduces randomness into the algorithm, which can make the process faster and more efficient. The dataset is shuffled (to randomize the process), and at every iteration SGD chooses one random data point to compute the gradient. So instead of going through millions of examples per update, SGD uses one random data point to update the parameters, which makes each step computationally much cheaper. Consequently, it can lead to a lot of 'noise' on the path to the minimum. For example: the circle maps represent the cost function from a bird's-eye point of view, and the path to the minimum point of the loss function looks very different for Gradient Descent and SGD.
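Here's a minimal numpy sketch of SGD as just described: shuffle the data, then update the weights from ONE random sample per iteration. The data and step size are made up:

```python
import numpy as np

# SGD on the same kind of toy linear-model cost as before.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
w_true = np.array([1.5, -0.5])
y = X @ w_true

w = np.zeros(2)
step = 0.05
for epoch in range(20):
    order = rng.permutation(len(y))        # shuffle every epoch
    for i in order:
        xi, yi = X[i], y[i]
        grad = 2 * (xi @ w - yi) * xi      # gradient from ONE sample
        w = w - step * grad                # cheap, noisy step downhill

# Noisy individual steps, but the weights still reach the minimum
assert np.allclose(w, w_true, atol=1e-2)
```

Each individual step points in a slightly wrong direction (hence the zig-zag path), but on average the steps head towards the minimum, and each one costs a 200th of a full gradient.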
For SGD, the path heads towards the minimum with harsh, abrupt turns (because we are using random single samples to update the weights), while the path for the Gradient Descent algorithm is smoother and more direct, since we use all of the training samples to compute each gradient. It's an interesting difference that is easy to see visually for the two optimizers.

Applications

The Stochastic Gradient Descent algorithm is one of the most used optimization algorithms in machine learning. It is very popular in deep learning and neural networks as well!
The Gradient Descent algorithm also has applications in adaptive filtering (learning-based systems), where it is used to optimize a filter's weights to minimize a cost value. One example of an adaptive filter is the noise cancellation algorithm in headsets!!! :)

***During the COVID-19 pandemic, it is important to remember not to panic, but to be precautious and stay safe. Please be sure to wash your hands thoroughly and properly, and practice social distancing at this time.*** Stay safe, we can fight this ❤️

With the global pandemic increasing in scale and spreading quickly, it is very important for us to stay cautious, safe, and healthy. There needs to be an efficient way to shorten diagnosis times for doctors, nurses, and medical professionals so that they can focus on treatment immediately, and one promising way is to use neural networks and deep learning to help solve this problem.

About Neural Nets

A Neural Network, or a neural "net", is a deep learning model which acts somewhat like a real biological neural network. It is a system which learns a specific pattern by taking in examples, without being explicitly programmed to achieve the task. To build or implement a neural network in code (for example, in Python), there are 4 steps to follow: 1. Set the architecture of the model, 2. Compile the model, 3. Fit the model, 4. Predict with the model. Before we delve into the four-step process, let's take a look at what a neural network looks like: a neural network is composed of neurons (or nodes) and layers.
Forward and Back Propagation

The neural network makes predictions using a process called forward propagation. It takes in the feature values at the input layer, performs the computations layer by layer, and produces a value at the output layer. Now, what if the network makes a prediction which is not accurate? What if the prediction the network makes is far from the actual value? This is where we need to update the weights so that the predictions get closer to the actual target values. To achieve this, the network goes backwards from the output layer through the hidden layer(s) (all the way to the input layer) and updates the weights using an optimizer so that the next predictions are more accurate. An optimizer is essentially an algorithm which helps us minimize the error between the predicted and the actual target value. Every time we train our network, our goal is to minimize the loss so that the neural network can accurately predict on the next set of data. If we plot the loss over all possible weights, we want the loss at its minimum value; to find it, we look for the minima of the function, where the slope is 0. The optimizer updates the weights using a learning rate, which controls how much each weight changes.

Four Step Process

Neural networks can be used to solve many classification or regression problems, such as image classification, character recognition, and natural language processing. But how do you implement a neural network to solve a problem? These are the 4 steps:
To build a neural network, we first set up the whole architecture, or structure, of the network. We begin by importing the libraries we need; the main one is keras, a neural network library for Python.
Importing important libraries...
In this step, you read in the input training dataset. To do this you use the read_csv function from the pandas library (another Python library, for working with datasets). We then use the Sequential model API from keras to build and instantiate the model. A sequential model is pretty self-explanatory: every layer only has connections to the layer coming after it. To add a layer to the model, you can use .add(Dense(...)) to specify the layer. Each layer has an activation function, which defines the output value of a node given the weights and inputs. The relu activation function says that if the value is positive, keep the value; otherwise output 0. When we define our first layer, we also need to add the input shape, which says how many columns or feature inputs the dataset has.

The next step is to compile the model, where we specify the optimizer. The optimizer modifies the weights throughout the training process so that the predictions get closer to the actual targets. To compile the model, you specify your optimizer (there are many kinds, for example the Adam optimizer and the SGD, Stochastic Gradient Descent, optimizer) and the loss function (for example MSE, mean squared error). The loss function is how we calculate the error between the predicted and actual values. For MSE, we square all of the prediction errors and take their average.
Compiling model....
The third step is to fit the model to your dataset!! This is a very fun part: you fit the X and y portions of the dataset (the input features and the targets, respectively). This fits the dataset to the model you have created.
Fitting the model
The last step is to predict values for your testing set. This tests the model's accuracy on data it hasn't seen before. To do this, you would:
Predicting for testing set
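Putting the four steps together, here is a minimal end-to-end sketch. It assumes keras (with a TensorFlow backend) is installed, and uses a small synthetic numpy dataset as a stand-in for a real CSV you would read with pandas' read_csv; the layer sizes and epoch count are arbitrary choices:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Synthetic stand-in for a dataset: 100 rows, 3 feature columns,
# one numeric target (in practice, X and y would come from read_csv).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5])

# Step 1: set the architecture
model = Sequential()
model.add(Dense(8, activation='relu', input_shape=(3,)))
model.add(Dense(1))

# Step 2: compile with an optimizer and a loss function
model.compile(optimizer='adam', loss='mse')

# Step 3: fit the model to the training data
model.fit(X, y, epochs=10, verbose=0)

# Step 4: predict on data the model hasn't seen
X_test = rng.normal(size=(5, 3))
preds = model.predict(X_test, verbose=0)
```

Ten epochs on random data won't give a great fit; the point is only the shape of the four-step workflow.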
Deep Learning and COVID-19

Deep learning algorithms and neural networks can help us detect infections from CT scans. This can speed up the process of diagnosis, and we can also use machine learning to learn from all sorts of data, whether it's the geographic impact of COVID-19 or other external influences of the virus. Stay safe, stay healthy, and I wish y'all loads of happiness!!!! We got this! ❤️

Whenever I think of a linear classifier or linear machine learning model, the first thing that comes to my mind is the equation y = mx + b. This equation does wonders in so many different fields and applications. It is essentially the foundation of how some ML models make predictions for testing data points.

What's 'classification'?

Classification is when we use some of a piece of data's characteristics to determine which group it falls into. For example, detecting whether a movie review is "good" or "bad", or grouping email as "spam" or "not spam". These situations all fall under "statistical classification", where the underlying problem is to identify which group a new piece of data belongs to, using training data to learn the pattern.

Linear Classifiers

Linear classifiers classify based on a linear function of the inputs. A binary linear classifier uses such a linear function to identify which of two groups a new observation belongs to. Note that binary classifiers only deal with two targets, or 'groups'. So how would we represent the input data or 'image' (as shown in the previous example above)? When dealing with classification, our data consists of input dimensions, or features. It also includes target variables, which represent the 'end result'.
In binary classification, the 'target' variable (the end result) can be 1 of 2 values (0 or 1, true or false, etc.: a binary-valued target). An example is a medical diagnosis system, where you predict whether a certain patient might carry an infection. The input data consists of the patient's history and symptoms, which are the features, and the target is whether or not they are carrying the infection. A classifier acts as a decision boundary, where one side corresponds to one class and the other side corresponds to the other class.

Inputs and Weights and Biases

A binary target value has two different possibilities, or "classes", which represent the end result (e.g. whether the email was spam or not spam). In a machine learning model, the training dataset consists of the feature variables and the corresponding target values: the data we already know about. We use this data to predict the target variable for the 'unseen' cases. Hope this makes sense, because it does get kind of tricky! So how are the predictions for the unseen data computed from this training data? The model computes a linear function (like y = mx + b) based on something called weights and biases, and then checks whether the output of the function is greater or less than a constant threshold, let's call it r. This is the raw model output:

raw model output = coefficients • features + intercept

So the raw model output is the dot product of the 'coefficients' (the weights) and the feature variables, plus an intercept (the bias). The weights represent the importance of a particular feature variable in the model. For instance, a particular symptom might be a very critical factor in whether the patient is infected or not, so it has a greater weight associated with it.
So in binary linear classification, if the model output is less than the threshold, the data point belongs to one class; if it is greater than the threshold, it belongs to the other class. In this example, the threshold value is 0. Therefore, if the model output is a positive value, it is predicted as one class, and if the output is negative, the model predicts the other class. The linear classifier is the decision boundary (the line); along the line, the outputs are exactly 0. If the intercept changes, the line shifts its position; if the weights (coefficients) of the linear function change, the line's slope and orientation change. There are lots and lots of applications correlated with linear classification, and there are also models which use multi-class linear classification, instead of just 2 classes.
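A minimal numpy sketch of this thresholding, with made-up weights and bias (the threshold is 0, as in the example):

```python
import numpy as np

# raw model output = coefficients . features + intercept
# The weights and bias here are illustrative values, not learned ones.
weights = np.array([0.8, -0.4, 1.2])   # one coefficient per feature
bias = -0.5                            # the intercept

def predict(x):
    raw = weights @ x + bias           # dot product plus intercept
    return 1 if raw > 0 else 0         # which side of the boundary?

# Points on opposite sides of the decision boundary
assert predict(np.array([1.0, 0.0, 1.0])) == 1   # raw output is positive
assert predict(np.array([0.0, 1.0, 0.0])) == 0   # raw output is negative
```

In a real model the weights and bias would be learned from the training data (e.g. with gradient descent); only the thresholding step is shown here.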
SQL is a very organized and functional, but tricky, language. It is tricky because you have to think logically about which variables need to be factored in to smartly pull the important data out of a table/database. In one of my projects I learnt a lot about SQL's functions and how they are used to query a dataset for relevant data. It does get tricky, so let's look at some other commands in this very interesting language:

JOINS!

There are basic commands in SQL for retrieving some columns of a table, the entire table, or just some records (rows) of the table: SELECT, FROM, WHERE, HAVING, ORDER BY. But what if you have more than one table, and you want to find out what the relationship between the two tables is? For example, let's say you have two tables: a Customers table and a Persons table. If you want to filter out the commonalities between these two tables (which records refer to the same individual, for example), you can use joins. If two tables have a common column or variable, you can merge the two tables to gather valuable information. So the table on the left is the Customers table, and the table on the right is the Persons table. Say the problem is to find out which individuals have been customers of a particular store. How would you go about solving this? The common column, or key, between the two tables is the 'CustomerID'/'PersonID' column. We can use this column to 'match' the tables and collect valuable info about the people who have shopped at the store. The way to query the common keys between the tables is the "INNER JOIN" command. An INNER JOIN outputs the records which share the same ID/key value between two or more tables. Using the INNER JOIN command here would give you:

Querying the data
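Here's a runnable version of that join using Python's built-in sqlite3 module as a stand-in for a real database. The table and key names follow the example above, but the rows and extra columns are made up:

```python
import sqlite3

# In-memory database with the two example tables; rows are made up.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE Customers (CustomerID INTEGER, Purchase TEXT)")
cur.execute("CREATE TABLE Persons (PersonID INTEGER, Name TEXT)")
cur.executemany("INSERT INTO Customers VALUES (?, ?)",
                [(1, "shoes"), (3, "books")])
cur.executemany("INSERT INTO Persons VALUES (?, ?)",
                [(1, "Ada"), (2, "Grace"), (3, "Edsger")])

# INNER JOIN keeps only the records whose key appears in BOTH tables;
# Grace (PersonID 2) has no matching customer record, so she drops out.
rows = cur.execute("""
    SELECT Persons.Name, Customers.Purchase
    FROM Customers
    INNER JOIN Persons ON Customers.CustomerID = Persons.PersonID
""").fetchall()

assert sorted(rows) == [("Ada", "shoes"), ("Edsger", "books")]
```

The ON clause is where the 'matching' of the common key happens.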
The inner join deals with finding the common keys between tables, but there are many more join types (such as left join, right join, self join, anti join, semi join, etc.). It is an endless but exciting world of SQL.
Ok so today my professor decided to tell us a Halloween joke, and I thought it was the funniest thing ever: "What is the circumference of a jack-o-lantern divided by its diameter?" Pumpkin pi!!!!!!!!!!!! lol, I thought it was funny!!!!!!!!!!!! (it's ok if you didn't) :) happy halloween lol. I didn't really dress up as anything because I barely even have time to fully get ready in the morning, but if "a student who is heading over to class to take her exam on matrices" counts as a costume, then yeah sure, I did "dress up".

Ok so I was reading this article a while ago about implementing AI algorithms in hardware chips, and I was like WOW?!!! That sounds VERY VERY cool and interesting, and could be valuable and have a positive impact on so many other industries. Recently, deep learning and machine learning have become pretty big things, because they can be used in practically any context. However, they also require an intensive amount of algorithms and complex processes, so we need efficient tools to accelerate the tasks of artificial intelligence. But what is deep learning, machine learning, artificial intelligence? I feel like these are the main words everyone is using lol, but we need to understand why we use these types of "learnings". We use ML (abbreviation for machine learning cuz I've already used it 100 times lol) to make a machine intelligent without explicitly programming it.

intro to machine learning, deep learning

So let's start with machine learning. It is one of the big new technologies out right now. If you haven't heard of the term, that's totally cool! It is a pretty new thing, and many industries (not even tech related btw) are using machine learning to efficiently do human tasks. Machine learning is a branch of study all about training a machine (a computer, for example) to complete tasks without explicitly programming it. Image classification is an excellent example to explain machine learning.
If you want a computer to classify a specific image as a cat, you would train it to learn certain features of a cat that distinguish it from other animals. Another example is detecting whether an email is spam or not. You basically need to feed large amounts of data to your machine learning model for it to learn patterns from that data and accurately predict on future datasets. This requires lots of algorithms and processing, which is where deep learning comes into play. Deep learning uses algorithms called neural networks to process, classify, and make predictions on data sets. In order to get accurate results, you need LOTS of data, and with more data it's going to take longer to analyze it all. This is where accelerating the 'analysis' of data comes into play: hardware processors can take care of that. Making a processor/microcontroller do many "intelligent" things sounds like such an amazing feat. In order to run such complex neural networks and process so much data and information, you need powerful and efficient processors. But which processors do we use? Which ones are the most efficient?

what is a CPU?

A CPU is basically the brain of a computer: the Central Processing Unit. It executes and performs all of the instructions (in a program, software, application, etc.) such as logical operations, arithmetic, and I/O (input/output, communication between devices). A long time ago, CPUs were built with one core, which means they could only perform one task at a time. Thanks to advancements in technology, we can now build multi-core CPUs, which can work on several tasks at a time.

CPUs vs. GPUs

A GPU is a Graphics Processing Unit, and it's designed differently from a CPU. A GPU has many, many more cores, and they are much smaller than the ones in a CPU.
The cores are designed like this so that parallel but simple computation (since the cores are smaller) can be performed, and many tasks can be completed simultaneously. GPUs are used a lot in the gaming industry, for image processing and computer graphics (hence the name "Graphics Processing Unit"). In general, the design of a GPU makes highly parallel algorithms more efficient than they would be on a CPU. A great example of the development of GPUs is NVIDIA's platform called CUDA! NVIDIA's CUDA computing platform uses GPUs to make algorithms and computing more efficient and fast for developers.

....and which one is better?

This totally depends on the situation you are working on. If you are working with deep learning and ML models, chances are GPUs are a better fit, because ML requires a lot of matrix math, which can be really effective when done in parallel. CPUs are better for more complex but sequential, step-by-step math or logic problems. Also, there is a huge cost factor: GPUs generally cost more than CPUs, so there are multiple factors to consider before coming to a decision.
A lot of people say "S-Q-L", separating the letters, and some pronounce it "sequel". It's up to you; I like "sequel", it just sounds cooler. SQL is a super important skill to learn when dealing with huge data sets. SQL, which stands for Structured Query Language, is a programming language used for retrieving and manipulating data stored in tables. More specifically, it is used to extract data from a relational database. A relational database is basically a bunch of tables put together. This skill is very useful for applications in machine learning, when you are dealing with vast amounts of data and you need a way to extract the important information so you can make predictions on future data. In a table, the rows are the records of the database, and the columns are the fields or attributes of the data. So for example, if we were dealing with a table of movies, the fields of a movie could be 'genre', 'rating' and 'duration'. Every row, or record, would be a single movie.

The 'SELECT' and 'FROM' commands

Just like any other programming language, SQL comes with its own syntax. The syntax is used to gather specific information from a huge table, so you can process the most important and valuable information you need. SELECT and FROM are two of the foundational commands of SQL, used to extract the columns of a table. SQL commands can be written in uppercase or lowercase, but most people prefer uppercase since it's easier to distinguish the commands from the column and table names. For instance, if we had a table featuring regular customers in a store, and we wanted to extract the name column of the customers table, we would write:

SELECT name FROM customers;
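If you want to try these commands yourself, Python ships with SQLite, a small relational database you can run entirely in memory. The customers table and its values below are invented just for this sketch:

```python
# A tiny in-memory relational database to play with SELECT and FROM.
# The table name, columns and rows are made up for this example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [("Alice", 22), ("Bob", 30), ("Ana", 19)])

# SELECT one column from the table...
names = conn.execute("SELECT name FROM customers").fetchall()
print(names)  # [('Alice',), ('Bob',), ('Ana',)]

# ...or several columns, separated by commas...
name_age = conn.execute("SELECT name, age FROM customers").fetchall()

# ...or every column at once with the asterisk.
everything = conn.execute("SELECT * FROM customers").fetchall()
print(everything)  # [('Alice', 22), ('Bob', 30), ('Ana', 19)]
```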
Suppose our customer database kinda looks like the one above. The above command focuses on the 'name' column of the table, highlighted in orange. What if you also wanted to take the customers' ages into consideration? If you want to select more than one column of a table, you simply add a comma to the SELECT statement and type the additional column right next to it. Also, at the end of every SQL statement, we add a semicolon so the database knows the SQL command is over. If you want to select all of the columns of the table (so you are essentially selecting the whole table), you type the asterisk symbol next to SELECT, like this: SELECT * FROM customers;

The 'WHERE', 'LIKE' and 'AND' commands

What if you want to make your commands even more specific and extract those details from large datasets? What if you want to get the names of the customers that start with the letter A? Or what if you wanted to get the customers who are 18-25 years old? To filter a dataset, you can use the WHERE command. WHERE allows you to filter the data in order to get exactly the information you need. So, if you want the names of the customers in the 18-25 age range, you first select the 'name' column from the customers table. Then, you use WHERE to keep only the customers who are 18 years old or older. But we have a boundary: we need the customers who are no older than 25 AND at least 18. This is where the AND command comes in. AND allows you to combine conditions on a particular field. So since we need to find customers who are at least 18 but at most 25, we can use AND to restrict the range of the age:

SELECT name FROM customers WHERE age >= 18 AND age <= 25;
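Here is the same in-memory SQLite setup, this time trying out the filtering commands (the table and its rows are again made up for the sketch), including LIKE for the "names starting with A" question:

```python
# Filtering a made-up customers table with WHERE, AND and LIKE.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [("Alice", 22), ("Bob", 30), ("Ana", 17), ("Carol", 25)])

# WHERE + AND: customers who are 18-25 years old (inclusive).
in_range = conn.execute(
    "SELECT name FROM customers WHERE age >= 18 AND age <= 25"
).fetchall()
print(in_range)  # [('Alice',), ('Carol',)]

# LIKE: names that start with the letter A ('%' matches anything after).
starts_with_a = conn.execute(
    "SELECT name FROM customers WHERE name LIKE 'A%'"
).fetchall()
print(starts_with_a)  # [('Alice',), ('Ana',)]
```

Note how LIKE uses '%' as a wildcard: 'A%' means "an A followed by anything", which is how we pick out Alice and Ana but not Bob or Carol.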
That's a wrap for SQL! These are just the simple, intro commands; there are lots more!
Machine learning is one of the most interesting and fastest-advancing areas of engineering and computer science. It shows you the potential of machines in our modern-day world. Machine learning is essentially making a machine learn to perform different tasks by feeding it enormous amounts of data. To accomplish this, there are multiple algorithms machines use to learn and adapt so they can perform tasks more easily on new inputs. One of these algorithms is called a neural network. A Neural Network (usually abbreviated as NN) is a learning algorithm that is loosely modeled on a biological neural network. Before going into the depths of a neural network (more in-depth posts coming up), let's take a look at what kinds of machine learning problems there are and how they are classified:
An example of supervised learning is segregating spam and non-spam mail. The computer takes in a bunch of mail and uses a machine learning algorithm to assign each message one of two outputs: spam, or not spam. The machine does this by learning from previously labeled examples, and it grasps the pattern to use on future inputs.
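The spam example above can be sketched as a toy supervised learner: it "trains" on messages that already carry a label, then labels new mail based on what it saw. Real spam filters are far more sophisticated; the function names and training messages here are invented just to show the learn-from-labeled-examples idea.

```python
# Toy supervised spam classifier: count how often each word appears in
# labeled spam vs. non-spam ("ham") mail, then label new mail by which
# class its words favor.
from collections import Counter

def train(labeled_messages):
    """labeled_messages: list of (text, label), label is 'spam' or 'ham'."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in labeled_messages:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    # Score each class by how often its training words appear in the text.
    scores = {label: sum(c[word] for word in text.lower().split())
              for label, c in counts.items()}
    return max(scores, key=scores.get)

training_data = [
    ("win a free prize now", "spam"),
    ("claim your free money", "spam"),
    ("meeting notes attached", "ham"),
    ("lunch tomorrow with the team", "ham"),
]
model = train(training_data)
print(predict(model, "free prize money"))        # spam
print(predict(model, "notes from the meeting"))  # ham
```

The key point is that the model never sees the two test messages during training; it generalizes from the labeled examples, which is exactly what "supervised" means.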
Artificial Intelligence has its own way of impressing us with its potential and wonders. It is an incredibly fast-growing field and will definitely have an immense effect on our lives. In my opinion, automating manual labor and work sounds like an amazing idea; people wouldn't need to worry about whether there is going to be traffic on their route to work, or review spelling/grammar errors in their lengthy research paper for English. AI does it all for you. Which is what makes this field so interesting. And amazing.
However, when something comes with this many advantages, there is always a critical side to it as well. There are many people who say that AI will take over the world and "replace" human jobs, which will lead to laying off many people in the workplace. A lot of people assume that AI will eventually replace human intelligence. Whenever I hear that, I always wonder when that "eventually" will come. AI has only recently started to take off, and there is something we humans have that machines know nothing about: emotional and psychological intelligence. Yes, maybe in 10-20 years from now, humans won't be needed for manual labor in factories, but machines don't understand emotion. That is something we have and can use to keep machines from dominating. However, that is a long way from now. Like I said, AI has only just started to grow. We need such automation to make the simple jobs in our lives easier. For example, Google uses your location to find the fastest route to work when you're a couple minutes late because you slept in (that's me for school). In fact, Google has also implemented smart reply in its apps, which automatically suggests replies based on the email in your inbox. Your essay reader, Turnitin, goes through your paper to check for plagiarism. These examples demonstrate the expanding growth of AI around the world. Good luck! Aarushi Ramesh