Do you think you know everything about data science and machine learning? Nobody does. Over the past few months I have interviewed with many companies for entry-level roles in data science and machine learning. Below are some of the important questions that come up in such interviews, which we will now go through. I hope the answers help you land your dream job.

**What is data normalization and why do we need it?**

Data normalization is a very important preprocessing step that rescales values to ensure better convergence during backpropagation. In general, it boils down to subtracting each feature's mean from every data point and dividing by that feature's standard deviation. If we don't do this, features with a larger magnitude will be weighted more heavily in the cost function: a 1% change in a high-magnitude feature is a large absolute change, while the same relative change in a small feature is almost insignificant. Normalization puts all features on an equal footing.
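As a minimal sketch of the mean and standard deviation rescaling described above, assuming NumPy is available. The `standardize` helper, the `eps` guard against division by zero, and the sample values are illustrative assumptions, not something taken from a specific library.

```python
import numpy as np

def standardize(X, eps=1e-12):
    """Rescale each column (feature) to zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)   # per-feature mean
    std = X.std(axis=0)     # per-feature standard deviation
    return (X - mean) / (std + eps)  # eps avoids dividing by zero for constant columns

# Hypothetical data: column 0 is an area in square meters, column 1 is a room count.
X = np.array([[50.0, 1.0], [80.0, 2.0], [120.0, 3.0], [200.0, 5.0]])
X_scaled = standardize(X)

print(X_scaled.mean(axis=0))  # close to 0 for both features
print(X_scaled.std(axis=0))   # close to 1 for both features
```

After rescaling, both columns live on the same scale, so neither dominates the cost function just because of its units.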

**What is dimensionality reduction?**

Dimensionality reduction is the process of reducing the number of feature variables under consideration. The importance of a feature depends on how much that feature variable contributes to the information representation of the data and on the technique you decide to use. Which technique to use comes down to trial and error and preference. Reducing the dimensionality of a dataset can have the following advantages: (1) It reduces the required storage space. (2) It speeds up computation (e.g. in machine learning algorithms). (3) It removes redundant features. For example, it makes no sense to store the size of a site in both square meters and square miles (perhaps the data collection was flawed). (4) If we reduce the data to 2D or 3D, we may be able to plot and visualize it. (5) Too many features or too complex a model can lead to overfitting, and using fewer features reduces that risk.
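One common technique is PCA. The sketch below is a hedged illustration using only NumPy's SVD. The `pca_project` helper and the synthetic columns (`area_m2`, `area_sq_miles`, `rooms`) are assumptions made for this example. It shows how a redundant feature, such as the same area stored in two units, carries no extra information.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project X onto its top principal components using SVD."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)          # PCA requires centered data
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]           # directions of largest variance
    explained = (S ** 2) / np.sum(S ** 2)    # fraction of variance per component
    return X_centered @ components.T, explained[:n_components]

# Synthetic data: the second column is the first one in different units.
rng = np.random.default_rng(0)
area_m2 = rng.uniform(50, 500, size=100)
area_sq_miles = area_m2 / 2_589_988.11       # redundant copy of area_m2
rooms = rng.integers(1, 8, size=100).astype(float)
X = np.column_stack([area_m2, area_sq_miles, rooms])

Z, ratio = pca_project(X, n_components=2)
print(Z.shape)       # three features reduced to two
print(ratio.sum())   # two components capture essentially all of the variance
```

In practice you would usually standardize the features first, since PCA is sensitive to scale. That step is left out here to keep the sketch short.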

**How do you deal with missing or corrupted data in a dataset?**

You can find missing or corrupted data in a dataset and either drop those rows or columns or replace the values with something else. There are two very useful methods in Pandas, `isnull()` and `dropna()`, which help you find rows and columns with missing or corrupted data and drop them. If you want to fill the invalid values with a placeholder value (e.g. 0), you can use the `fillna()` method.
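The three Pandas methods mentioned above can be sketched as follows. The small `df` with `age` and `city` columns is a made-up example, and the choice of the median and `"unknown"` as fill values is an assumption for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with one missing value per column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Paris", "Berlin", None, "Rome"],
})

missing_per_column = df.isnull().sum()   # count missing values in each column
rows_dropped = df.dropna()               # drop rows that contain any missing value
cols_dropped = df.dropna(axis=1)         # drop columns that contain any missing value

age_zero_filled = df["age"].fillna(0)    # placeholder value, as in the text
filled = df.fillna({"age": df["age"].median(), "city": "unknown"})  # per-column fill

print(missing_per_column)
print(filled)
```

Dropping is the simplest option but throws information away. Filling keeps every row at the cost of introducing values that were never observed.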

**What does cluster analysis mean?**

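Cluster analysis groups data points so that points in the same group are more similar to each other than to points in other groups. The popular article on the 5 clustering algorithms that every data scientist needs to know explains them in great detail with great visualizations.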

**How would you do an Exploratory Data Analysis (EDA)?**

The goal of an EDA is to gain some insight from the data before applying a predictive model. Basically, you want to do your EDA from coarse to fine. We start by getting some global insights at the highest level. Check for imbalanced classes. Look at the mean and variance of each class. Look at the first few rows to see what the data is about. Run a Pandas `df.info()` to see which features are continuous or categorical and what their types are (int, float, string). Next, drop unnecessary columns that are not useful for analysis and prediction. These can simply be columns that look useless, columns in which many rows have the same value (i.e. they do not give us much information), or columns with many missing values. We can also fill in missing values with the most common value in that column or with the median.

Now we can create some basic visualizations. Start with high-level stuff. Draw some bar plots for categorical features with few groups. Draw bar plots of the final classes. Take a look at the most "general" features. Create some visualizations of these individual features to get basic insights.

Now we can start to be more specific. Create visualizations between two or three features at a time. How are the features related? You can also do a PCA to see which features contain the most information. Group some features together to see their relationships. For example, what happens to the classes when A = 0 and B = 0? How about A = 1 and B = 0? Compare different features. For example, if feature A can be "Female" or "Male", you can plot feature A against the cabin people stayed in to see whether men and women stayed in different cabins. Beyond bar, scatter, and other basic plots, we can create PDF/CDF plots, overlaid plots, etc. Take a look at some statistics such as distributions, p-values, etc.

Finally, it is time to build the ML model. Start with simpler things like Naive Bayes and linear regression. If you find that those perform poorly or that the data is highly non-linear, use polynomial regression, decision trees, or SVMs. The features can be selected based on their importance as seen in the EDA. If you have a lot of data, you can use a neural network. Check the ROC curve, precision, and recall.
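The first coarse pass of an EDA can be sketched in a few lines of Pandas. Everything in the toy `df` below (the column names `sex`, `cabin_class`, `age`, `constant`, `survived` and their values) is invented for illustration, and the "at most one distinct value" rule for dropping columns is just one possible assumption about what counts as uninformative.

```python
import pandas as pd

# Hypothetical dataset; "survived" plays the role of the class label.
df = pd.DataFrame({
    "sex": ["female", "male", "male", "female", "male", "female"],
    "cabin_class": [1, 3, 3, 2, 1, 1],
    "age": [29.0, 35.0, None, 41.0, 22.0, 30.0],
    "constant": [0, 0, 0, 0, 0, 0],
    "survived": [1, 0, 0, 1, 0, 1],
})

print(df.head())   # look at the first few rows
df.info()          # column types and non-null counts

# Global view: class balance, then mean and variance per class.
class_balance = df["survived"].value_counts(normalize=True)
per_class_stats = df.groupby("survived")["age"].agg(["mean", "var"])

# Drop columns where every row has the same value (no information).
low_info = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
df = df.drop(columns=low_info)

# Fill remaining missing values with the column median.
df["age"] = df["age"].fillna(df["age"].median())

# Finer view: how do two features relate to each other?
sex_by_cabin = pd.crosstab(df["sex"], df["cabin_class"])
print(class_balance, per_class_stats, sex_by_cabin, sep="\n\n")
```

From here, the same tables can be passed to a plotting library to produce the bar and scatter plots discussed above.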

The main tool of exploratory data analysis is Interactive Statistical Graphics (ISG), which makes it possible to manipulate and plot datasets quickly and interactively. Exploratory data analysis shows its strengths especially when dealing with real datasets in a real context. It makes it possible to find starting points for complex problems that go beyond textbook or constructed examples of statistical procedures. Since the modeling process and the first evaluation of the data cause the most problems for learners and researchers, using interactive statistical graphics on real datasets is a worthwhile starting point when learning.

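**Why do we use convolutions for images instead of just FC layers?**

This answer has two parts. First, convolutions preserve, encode, and actually use the spatial information from the image. If we used only fully connected (FC) layers, we would have no relative spatial information. Second, Convolutional Neural Networks (CNNs) have a partially built-in translation invariance, since each convolution kernel acts as its own filter or feature detector.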

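**What makes CNNs translation invariant?**

Imagine that you are doing object detection. It doesn't matter where in the image the object is, because the convolution is applied in a sliding-window fashion across the entire image, so the same kernel detects the feature wherever it appears.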
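To make the sliding-window argument concrete, here is a small NumPy sketch. The `correlate2d_valid` helper, the 2x2 `pattern`, and the 8x8 images are all assumptions made for the example (deep learning frameworks implement "convolution" as this kind of cross-correlation, only far more efficiently). The same kernel responds equally strongly to the pattern in both images, and only the location of the peak moves.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the elementwise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny "object" and a kernel that matches it.
pattern = np.array([[1.0, 0.0], [0.0, 1.0]])
kernel = pattern.copy()

# The same object placed at two different positions.
image_a = np.zeros((8, 8))
image_a[1:3, 1:3] = pattern
image_b = np.zeros((8, 8))
image_b[4:6, 5:7] = pattern

resp_a = correlate2d_valid(image_a, kernel)
resp_b = correlate2d_valid(image_b, kernel)
peak_a = np.unravel_index(resp_a.argmax(), resp_a.shape)
peak_b = np.unravel_index(resp_b.argmax(), resp_b.shape)

print(resp_a.max(), peak_a)  # strongest response sits where the object is in image_a
print(resp_b.max(), peak_b)  # same strength, shifted location in image_b
```

Strictly speaking, the convolution itself is translation equivariant (the response moves with the object). Pooling and later layers are what turn this into approximate invariance.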

**Why do we have max pooling in CNNs?**

Max pooling in a CNN lets you reduce computation, since the feature maps are smaller after pooling. You don't lose too much semantic information because you keep the maximum activation. There is also a theory that max pooling contributes a little to giving CNNs more translation invariance. Check out this great video from Andrew Ng about the benefits of max pooling ( https://www.coursera.org/lecture/convolutional-neural-networks/pooling-layers-hELHk ).
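A minimal NumPy sketch of non-overlapping max pooling, assuming a single 2D feature map. The `max_pool2d` helper and the `feature_map` values are made up for illustration. Real frameworks also handle batches, channels, strides, and padding.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling over size x size windows of a 2D array."""
    h, w = x.shape
    h_out, w_out = h // size, w // size
    x = x[:h_out * size, :w_out * size]            # crop so the windows tile evenly
    blocks = x.reshape(h_out, size, w_out, size)   # split rows and columns into windows
    return blocks.max(axis=(1, 3))                 # keep the maximum of each window

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
], dtype=float)

pooled = max_pool2d(feature_map)
print(pooled)
# [[6. 2.]
#  [2. 9.]]
```

The output has a quarter of the values of the input, which is where the savings in computation for later layers come from, while the strongest activation in each window is kept.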

**Why do segmentation CNNs have an encoder-decoder style / structure?**

The CNN encoder can basically be viewed as a feature extraction network, while the decoder uses that information to predict the image segments by "decoding" the features and upscaling them to the original image size.

**What is batch normalization and why does it work?**

Batch normalization was developed to reduce internal covariate shift. Internal covariate shift occurs when the distribution of the activations in a layer shifts significantly during training. Batch normalization is used so that the distribution of the inputs to a given layer does not change much over time due to the parameter updates from each batch. Some practitioners find that batch normalization works best after the activation function, in which case these inputs are literally the result of an activation function, although the original paper applies it before the nonlinearity. It uses batch statistics to perform the normalization and then uses the batch normalization parameters (gamma and beta in the original paper) to ensure that the transformation inserted into the network can represent the identity transformation.
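The training-time forward pass can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions: the `batch_norm_forward` helper is hypothetical, it handles a 2D batch of shape (samples, features), and it omits the running statistics used at inference time. The last lines show how a suitable choice of gamma and beta recovers the identity transformation mentioned above.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale by gamma and shift by beta."""
    mean = x.mean(axis=0)                     # batch mean per feature
    var = x.var(axis=0)                       # batch variance per feature
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * x_hat + beta               # learnable scale and shift

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 4))  # hypothetical layer inputs

# With gamma = 1 and beta = 0 the output is simply the normalized batch.
out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0))  # close to 0
print(out.std(axis=0))   # close to 1

# With gamma = sqrt(var + eps) and beta = mean, the layer undoes its own normalization.
gamma_id = np.sqrt(x.var(axis=0) + 1e-5)
beta_id = x.mean(axis=0)
recovered = batch_norm_forward(x, gamma_id, beta_id)
print(np.allclose(recovered, x))
```

Because gamma and beta are learned, the network can settle anywhere between full normalization and the identity, whichever helps training most.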

**Conclusion**

What other machine learning questions would you like answered?

How do you learn machine learning from scratch?

What questions do you get most often in the area of AI?

The AI TechbyLight team will be happy to answer these and other questions by email or in the Q&A area.
