
Data Analytics SPPU MCQ
1. The ——– function is used to add a title to each Axes instance in a figure.
A : set_title()
B : get_title()
C : set_label()
D : title()
set_title()
2. ———- provides a range of supervised and unsupervised learning
algorithms via a consistent interface in Python.
A : Pandas
B : Numpy
C : Scikit-Learn
D : image
Scikit-Learn
3. The ———- attribute specifies the number of dimensions or axes of the
array.
A : ndarray.size
B : ndarray.dtype
C : ndarray.ndim
D : ndarray.axes
ndarray.ndim
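As a minimal sketch of question 3, assuming NumPy is installed, the following compares ndim with the related size and shape attributes. The array `a` is a made-up example, not part of the question.

```python
import numpy as np

# Hypothetical 2 x 3 array used only for illustration.
a = np.array([[1, 2, 3], [4, 5, 6]])

ndim = a.ndim      # number of dimensions (axes)
size = a.size      # total number of elements
shape = a.shape    # length along each axis

print(ndim, size, shape)  # 2 6 (2, 3)
```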
4. The ———– algorithm is based on the fact that the algorithm uses prior
knowledge to find frequent itemsets.
A : Clustering
B : Regression
C : Naïve Bayes
D : Apriori
Apriori
5. The —————- submodule of SciPy is dedicated to image processing.
A : ndarray
B : spatial
C : ndimage
D : special
ndimage
6. If the number of input features is 3, then the optimal hyperplane in a support
vector machine is a ————-
A : Single point
B : Line
C : 2-D Plane
D : Non linear line
2-D Plane
7. ————— is an example of human-generated unstructured data.
A : Text files
B : Satellite data
C : Sensor data
D : Seismic imagery data
Text files
8. ————- must be installed before you use scikit-learn.
A : Matlab
B : Scilab
C : Scipy
D : Numpy
Scipy
9. The procedure of organizing items of a given collection into groups based on
some similar features is called ————-
A : Regression
B : Clustering
C : Decision Trees
D : Association
Clustering
10. In statistics, a population consists of ——————
A : All People living in a country.
B : All People living in the city.
C : All subjects or objects whose characteristics are being studied.
D : Part of whole dataset
All subjects or objects whose characteristics are being studied.
11. Which function is used to give a title to the axes?
A : plt.title()
B : plt.xlabel()
C : plt.ylabel()
D : plt.xscale()
plt.title()
12. The ————- function is used to plot a histogram using the matplotlib library.
A : hist()
B : bar()
C : pie()
hist()
13. Which of the following is a measure used in decision trees when selecting
the splitting criterion that partitions the data in the best possible manner?
A : Probability
B : Gini Index
C : Regression
D : Association
Gini Index
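To make the answer to question 13 concrete, here is a minimal sketch of the Gini index for a list of class labels, computed as 1 minus the sum of squared class proportions. The function name `gini_index` and the sample labels are illustrative assumptions.

```python
from collections import Counter

def gini_index(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

# A pure node has impurity 0; an even two-class split has impurity 0.5.
pure = gini_index(["yes", "yes", "yes", "yes"])
mixed = gini_index(["yes", "yes", "no", "no"])
print(pure, mixed)  # 0.0 0.5
```

A decision tree that uses this measure prefers the split whose child nodes have the lowest weighted impurity.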
14. Email data is an example of ———-
A : Structured data
B : Un-Structured data
C : Semi-Structured data
D : Scattered
Un-Structured data
15. Which of the following is not a type of clustering algorithm?
A : Density clustering
B : K-Mean clustering
C : Centroid clustering
D : Simple clustering
Simple clustering
16. —— analytics answers questions like “How can we make it happen?”
A : Descriptive
B : Prescriptive
C : Predictive
D : Probability
Prescriptive
17. ————– data does not fit into a data model due to variations in its contents.
A : Structured data
B : Un-Structured data
C : Semi-Structured data
Un-Structured data
18. The ————— function multiplies two matrices in NumPy.
A : prod()
B : mult()
C : dot()
D : *
dot()
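The difference between options C and D in question 18 can be sketched as follows, assuming NumPy is installed. The two 2 x 2 arrays are made-up values.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# dot() performs matrix multiplication (rows of a times columns of b).
matrix_product = np.dot(a, b)

# The * operator on ndarrays multiplies element by element instead.
elementwise = a * b

print(matrix_product.tolist())  # [[19, 22], [43, 50]]
print(elementwise.tolist())     # [[5, 12], [21, 32]]
```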
19. ——————– is a general-purpose array-processing package that provides a
high-performance multi-dimensional array object and tools for working with
these arrays.
A : NumPy
B : SciPy
C : sklearn
D : None of these
NumPy
20. The ——– library is built on top of NumPy, SciPy, and Matplotlib.
A : Sympy
B : Scikit
C : Pandas
D : Numpy
Scikit
21. The last element of ndarray is indexed by ————-
A : 0
B : -1
C : 1
D : -2
-1
22. The ———— step is performed by a data scientist after acquiring the data.
A : Data Cleansing
B : Data Integration
C : Data Replication
D : Data loading
Data Cleansing
23. The ———— function is used to save an array as an image file.
A : matplotlib.pyplot.image()
B : matplotlib.pyplot.imread()
C : matplotlib.pyplot.imwrite()
D : matplotlib.pyplot.imsave()
matplotlib.pyplot.imsave()
24. ————- is unsupervised machine learning technique.
A : KNN
B : Support Vector Machines
C : Decision trees
D : Cluster analysis
Cluster analysis
25. What is the correct syntax to generate integers from 10 to 30 (inclusive)?
A : x=numpy.arange(10,30)
B : x=numpy.array(10,30)
C : x=numpy.arange(10,31)
D : x=arange(10,31)
x=numpy.arange(10,31)
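The reason the stop value must be 31 is that arange excludes its end point. A short sketch, assuming NumPy is installed, which also shows the -1 index from question 21:

```python
import numpy as np

x = np.arange(10, 31)        # stop value 31 is excluded, so 30 is the last item
too_short = np.arange(10, 30)

first = int(x[0])
last = int(x[-1])            # index -1 refers to the last element
count = x.size

print(first, last, count)    # 10 30 21
print(int(too_short[-1]))    # 29
```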
26. The ——— function is used to get the element-wise remainder of array division.
A : numpy.divide(x1,x2)
B : numpy.mod(x1,x2)
C : numpy.true_divide(x1,x2)
D : numpy.reminder(x1,x2)
numpy.mod(x1,x2)
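A minimal sketch of the element-wise remainder, assuming NumPy is installed; the input arrays are made-up values. NumPy also exposes the same operation under the name numpy.remainder, whereas the spelling in option D is not a NumPy function.

```python
import numpy as np

x1 = np.array([10, 11, 12, 13])
x2 = np.array([3, 4, 5, 6])

# Remainder of x1 / x2, computed pair by pair.
remainders = np.mod(x1, x2)

print(remainders.tolist())  # [1, 3, 2, 1]
```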
27. ———– is an indication of how often the rule has been found to be true in
association rule mining.
A : Confidence
B : Support
C : Lift
D : None of These
Confidence
28. ———— is a supervised machine learning algorithm which relies on the
assumption of feature independence to classify input data.
A : Clustering
B : Regression
C : Naïve Bayes
D : Apriori
Naïve Bayes
29. What is the use of the following function? plt.xlabel("Total Marks")
A : Gives label to X-Axis
B : Gives label to Y-Axis
C : Gives title to figure
D : Add text to figure
Gives label to X-Axis
30. Pandas provides the ———– function as the entry point for all standard
database join operations while merging two DataFrame objects.
A : concat()
B : replace()
C : merge()
D : add()
merge()
31. Data generated on twitter is an example of ———
A : Structured data
B : Un-Structured data
C : Semi-Structured data
D : Scattered
Un-Structured data
32. —————– is an excellent 2D and 3D graphics library for generating
scientific figures?
A : Pandas
B : Numpy
C : matplotlib
D : ndarray
matplotlib
33. Support(B) =
A : (Transactions containing (B)) / (Total Transactions)
B : (Transactions containing (B)) / 100
C : (Total Transactions) / (Transactions containing (B))
D : 100/ (Transactions containing (B))
(Transactions containing (B)) / (Total Transactions)
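The support formula above, together with the confidence measure from question 27, can be sketched in plain Python. The transactions and the helper names `support` and `confidence` are illustrative assumptions.

```python
# Hypothetical market-basket data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(A and B) / support(A): how often the rule A -> B holds."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

support_bread = support({"bread"}, transactions)                 # 3 of 4
conf_bread_milk = confidence({"bread"}, {"milk"}, transactions)  # 2 of 3
print(support_bread, round(conf_bread_milk, 3))  # 0.75 0.667
```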
34. ———— is an example of semi structured data
A : NoSQL data
B : YouTube data
C : Text File data
D : Satellite imagery data
NoSQL data
35. ——————— is raster graphic format with lossless compression.
A : EPS
B : PDF
C : PNG
D : PS
PNG
36. A —————— is a flowchart-like tree structure, where each internal node
denotes a test on an attribute, each branch represents an outcome of the test, and
leaf nodes represent classes or class distributions.
A : Decision tree
B : Association Rule Mining
C : Clustering
D : Support vector machines
Decision tree
37. ——————— is a form of supervised learning used by mail service
providers such as Gmail and Yahoo to classify a new mail as spam or not spam.
A : Classification
B : Regression
C : Clustering
D : Naïve Bayes
Classification
38. In a ———— the x-axis values are grouped into bins and each bin is treated
as a category.
A : Bar
B : Line
C : Scatter
D : Histogram
Histogram
39. When data are collected in a statistical study for only a portion or subset
of all elements of interest, we are using a
A : Sample
B : Parameter
C : Population
D : Probability
Sample
40. ———— regression finds a relationship between one or more features
(independent variables) and a continuous variable (dependent variable).
A : Non-linear
B : Linear
C : Both of these
D : None of These
Linear
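As a hedged illustration of question 40, the sketch below fits a straight line with NumPy's polyfit, which is only one of several ways to do linear regression. The variables `hours` and `marks` are made-up data that follow marks = 2 * hours + 1 exactly.

```python
import numpy as np

# Hypothetical feature (independent variable) and target (dependent variable).
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
marks = 2.0 * hours + 1.0

# Least-squares fit of a degree-1 polynomial: marks ~ slope * hours + intercept.
slope, intercept = np.polyfit(hours, marks, deg=1)

# Predict the continuous target for an unseen feature value.
predicted = slope * 6.0 + intercept
print(round(float(slope), 3), round(float(intercept), 3), round(float(predicted), 3))
```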
41. It is a measure of disorder or purity or unpredictability or uncertainty.
A : Entropy
B : Support
C : Confidence
D : lift
Entropy
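Entropy as a measure of disorder can be sketched in a few lines of plain Python. The function name `entropy` and the sample labels are illustrative assumptions; the formula used is the usual sum of -p * log2(p) over the class proportions p.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

# A pure set is perfectly predictable; an even split is maximally uncertain.
pure = entropy(["spam", "spam", "spam", "spam"])
mixed = entropy(["spam", "spam", "ham", "ham"])
print(pure, mixed)  # 0.0 1.0
```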
42. Which of the following functions is not used to iterate over the rows of a
DataFrame?
A : iteritems()
B : iterrows()
C : itertuples()
D : iterpanel()
iterpanel()
43. ——— is a technique that duplicates the smaller array so that its
dimensionality and size match those of the larger array.
A : Multiplication
B : Broadcasting
C : Addition
D : Flatten
Broadcasting
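A minimal broadcasting sketch, assuming NumPy is installed; the arrays are made-up values. The smaller array behaves as if it were repeated along the missing axis, although NumPy does not actually make physical copies of it.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])               # shape (3,)

# row is treated as if it were stacked twice to match the (2, 3) shape.
result = matrix + row

print(result.shape)     # (2, 3)
print(result.tolist())  # [[11, 22, 33], [14, 25, 36]]
```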
44. Which of the following tasks is not performed by a data scientist?
A : Define the question
B : Create reproducible code
C : Challenge results
D : Staff Recruitment
Staff Recruitment
45. To save a figure to a file, we can use the ———— method of the Figure class
in matplotlib.
A : save()
B : savefig()
C : Figure()
D : save_image()
savefig()
46. The ———- machine learning algorithm is used in cross marketing to work with
other businesses that complement your own business, but not with competitors.
A : Decision tree
B : Association Rule Mining
C : Clustering
D : Support vector machine
Association Rule Mining
47. Which function returns an ndarray object that contains the numbers that
are evenly spaced on a log scale.
A : numpy.logspace()
B : numpy.log()
C : numpy.fill()
D : numpy.random()
numpy.logspace()
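A short sketch of logspace, assuming NumPy is installed. The start and stop arguments are exponents of the base (10 by default), so the values below are evenly spaced on a log scale.

```python
import numpy as np

# Four values from 10**0 to 10**3, evenly spaced in the exponent.
values = np.logspace(0, 3, num=4)

print(values.tolist())  # [1.0, 10.0, 100.0, 1000.0]
```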
48. The ——— argument of merge function while merging two dataframes
specifies which keys are to be included in the resulting dataframe.
A : right
B : on
C : sort
D : how
how
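The effect of the how argument can be sketched as follows, assuming pandas is installed. The two DataFrames and their column names are made-up examples.

```python
import pandas as pd

# Hypothetical tables sharing the key column "id".
left = pd.DataFrame({"id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]})
right = pd.DataFrame({"id": [2, 3, 4], "marks": [70, 85, 90]})

# how decides which keys survive the join.
inner = pd.merge(left, right, on="id", how="inner")     # keys in both tables
outer = pd.merge(left, right, on="id", how="outer")     # keys in either table
left_join = pd.merge(left, right, on="id", how="left")  # all keys from left

print(inner["id"].tolist())           # [2, 3]
print(sorted(outer["id"].tolist()))   # [1, 2, 3, 4]
print(left_join["id"].tolist())       # [1, 2, 3]
```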
49. Which of the following functions is used to split a figure into nrows*ncols
sub-axes?
A : plot()
B : draw()
C : bar()
D : subplot()
subplot()
50. The ——– function is used to display an image through an external viewer in
SciPy.
A : display()
B : imread()
C : imshow()
D : show()
imshow()
51. ——– is an unsupervised algorithm used for frequent itemset mining.
A : Apriori
B : Support Vector Machines
C : Decision trees
D : Cluster analysis
Apriori
52. The — —– is characterized by a bell-shaped curve, and the area under the
curve represents probabilities.
A : Normal Distribution
B : Binomial Distribution
C : Poisson Distribution
D : Probability
Normal Distribution
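That areas under the bell curve are probabilities can be checked with the standard library's NormalDist class (Python 3.8 or later). The choice of a standard normal distribution here is an assumption made for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard = NormalDist(mu=0.0, sigma=1.0)

# Area under the curve to the left of the mean.
below_mean = standard.cdf(0.0)

# Area between -1 and +1 standard deviations (roughly 68%).
within_one_sd = standard.cdf(1.0) - standard.cdf(-1.0)

print(below_mean, round(within_one_sd, 4))  # 0.5 0.6827
```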
53. Apriori algorithm uses breadth first search and ————structure to
count candidate item sets efficiently.
A : Decision tree
B : Hash tree
C : Red-Black Tree
D : AVL Tree
Hash tree
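The level-wise (breadth-first) search that Apriori performs can be sketched in plain Python. This is a simplified illustration, not a full implementation: it counts candidates with a straightforward scan rather than a hash tree, and the function name `apriori` and the transactions are made-up assumptions.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]  # level 1 candidates
    frequent = {}
    k = 1
    while current:
        # Count each candidate by scanning all transactions.
        level = {}
        for cand in current:
            sup = sum(1 for t in transactions if cand <= t) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        # Join step: combine frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (the "prior knowledge"): drop any candidate that has
        # an infrequent subset of size k - 1.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(transactions, min_support=0.5)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

With these transactions, the three-item candidate is pruned without being counted because one of its two-item subsets is already known to be infrequent.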
54. In a data science project, the data acquisition step involves ——–
A : Acquiring data from various sources.
B : Selecting dataset
C : Data preprocessing
D : Data modeling
Acquiring data from various sources.
55. Select the correct statement:
A : Raw data is original source of data.
B : Preprocessed data is original source of data.
C : Raw data is the data obtained after processing steps.
D : Analysed data is original source of data
Raw data is original source of data.
56. Which of the following statement will create an axes at the top right
corner of the current figure
A : subplot(2,3,3)
B : subplot(2,3,2)
C : subplot(2,3,4)
D : subplot(2,3,5)
subplot(2,3,3)
57. Catalog design is a complex process where the items in a
business’s catalog are often selected to complement each other so that buying
one item will lead to buying another. So these items are often complements or
very related. Which algorithm
A : Decision tree
B : Association Rule Mining
C : Clustering
D : Support vector machine
Association Rule Mining
58. When plotting with matplotlib.pyplot, a function call equivalent to
subplot(2,3,4) is
A : subplot(234)
B : subplot(243)
C : subplot(324)
D : subplot(4)
subplot(234)
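The two call forms, and the top right position from question 56, can be compared in a short sketch. It assumes matplotlib is installed and uses the non-interactive Agg backend so that no window is opened; the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display required
import matplotlib.pyplot as plt

fig = plt.figure()

# 2 rows, 3 columns, position 4 (first cell of the second row).
ax_a = plt.subplot(2, 3, 4)
# Three-digit shorthand for the same rows, columns, and position.
ax_b = plt.subplot(234)
# Position 3 is the last cell of the first row: the top right corner.
ax_top_right = plt.subplot(2, 3, 3)

same_place = ax_a.get_position().bounds == ax_b.get_position().bounds
print(same_place)  # True
```

Positions are counted row by row starting from the top left, which is why position 3 in a 2 x 3 grid is the top right cell.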
59. The ———— algorithm models a series of logical If-Then-Else decision
statements; there is no underlying assumption of a linear or non-linear
relationship between the input variables and response variables.
A : Regression
B : Decision Trees
C : Clustering
D : Naïve Bayes
Decision Trees
60. To reach the final point and make a prediction, decision trees must
be traversed from ———-
A : Top to bottom
B : Bottom to top
C : Left to right
D : Right to left
Top to bottom