Which NumPy function is used to generate samples from a Zipf distribution?
Analysis & Theory
`np.random.zipf()` generates random samples from the Zipf distribution.
What does the parameter `a` represent in `np.random.zipf(a)`?
A
Mean of the distribution
D
Exponent (power-law) parameter
Analysis & Theory
`a` is the exponent parameter of the Zipf distribution, controlling how steeply the probabilities decay.
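One way to see the exponent at work is the sketch below. It assumes the probability mass function is proportional to `k**(-a)`, in which case the ratio P(1)/P(2) should come out close to `2**a`. The exponent, sample size, and seed are arbitrary illustrative choices.

```python
import numpy as np

np.random.seed(0)   # arbitrary seed, only so the run is repeatable
a = 2.0             # illustrative exponent
samples = np.random.zipf(a, size=200_000)

# If P(k) is proportional to k**(-a), then P(1) / P(2) equals 2**a.
ratio = np.sum(samples == 1) / np.sum(samples == 2)
print(ratio, 2 ** a)
```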
What is the domain of values returned by `np.random.zipf(a)`?
B
Positive integers starting from 1
Analysis & Theory
`np.random.zipf()` returns positive integers, so every sample is ≥ 1.
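The domain can be checked empirically. This is a minimal sketch with an arbitrary exponent, seed, and sample size; it only inspects the smallest value drawn and the dtype of the result.

```python
import numpy as np

np.random.seed(1)   # arbitrary seed
samples = np.random.zipf(2.0, size=10_000)

smallest = samples.min()                               # smallest value drawn
is_integer = np.issubdtype(samples.dtype, np.integer)  # integer dtype check
print(smallest, is_integer)
```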
What happens if `a <= 1` in `np.random.zipf(a)`?
B
Raises a ValueError or unstable results
D
Returns floating point values
Analysis & Theory
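For `a <= 1`, the Zipf distribution is undefined: its normalizing constant, the sum of `k**(-a)` over all positive integers `k`, diverges (for `a = 1` this sum is the harmonic series). NumPy therefore requires `a > 1` and raises a `ValueError` otherwise.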
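The behaviour for invalid exponents can be probed with a short sketch like the following. The two test values are arbitrary, and the exact error message may differ between NumPy versions, so only the exception type is relied on.

```python
import numpy as np

errors = []
for a in (1.0, 0.5):   # both violate the a > 1 requirement
    try:
        np.random.zipf(a, size=3)
    except ValueError as exc:
        errors.append((a, str(exc)))
        print(f"a={a}: ValueError: {exc}")
```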
What kind of data is Zipf’s law often used to model?
B
Frequency of words in natural language
Analysis & Theory
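Zipf's law is often used to model word frequencies, where a word's frequency is roughly inversely proportional to its rank: the most frequent word appears about twice as often as the second most frequent, three times as often as the third, and so on. This is a power-law decay, not an exponential one.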
Which of the following values of `a` will result in a very steep distribution (most values are 1)?
Analysis & Theory
Higher `a` values make the distribution steeper: 1 is drawn more often and large values become rarer.
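The effect of `a` on steepness can be illustrated by measuring the share of samples equal to 1 for a few exponents. The specific exponents, seed, and sample size below are illustrative assumptions, not values from the quiz.

```python
import numpy as np

np.random.seed(2)   # arbitrary seed
fractions = {}
for a in (1.5, 2.0, 4.0):
    samples = np.random.zipf(a, size=50_000)
    fractions[a] = np.mean(samples == 1)   # share of samples equal to 1
    print(a, fractions[a])
```

The share of 1s should grow as `a` increases.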
What is the shape of the result from `np.random.zipf(2.0, size=(2, 3))`?
Analysis & Theory
`size=(2, 3)` generates a 2x3 array of Zipf-distributed integers.
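A quick check of the output shape, with the no-`size` call included for comparison:

```python
import numpy as np

arr = np.random.zipf(2.0, size=(2, 3))
print(arr.shape)          # (2, 3)

single = np.random.zipf(2.0)   # no size argument: a single value
print(np.ndim(single))
```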
How can you ensure reproducibility when using `np.random.zipf()`?
A
Use `np.random.seed()` before calling the function
B
Use `np.set_zipf_seed()`
Analysis & Theory
Calling `np.random.seed()` with a fixed value before sampling seeds NumPy's global random state, so `np.random.zipf()` returns the same sequence on every run.
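A minimal sketch of seeding, using an arbitrary seed value of 42. The second half uses a separately seeded `np.random.default_rng()` generator, which is an alternative not covered by the question itself.

```python
import numpy as np

# Re-seeding the global state reproduces the same draws.
np.random.seed(42)
first = np.random.zipf(2.0, size=5)

np.random.seed(42)
second = np.random.zipf(2.0, size=5)
print(np.array_equal(first, second))

# Alternative: independent Generator objects created with the same seed.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
gen_first = rng_a.zipf(2.0, size=5)
gen_second = rng_b.zipf(2.0, size=5)
print(np.array_equal(gen_first, gen_second))
```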
What does the distribution look like for `np.random.zipf(1.0001, size=10000)`?
C
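Extremely heavy-tailed, with many very large values
D
Centered around the mean
Analysis & Theory
With `a` just above 1, the probabilities decay very slowly, so the tail is extremely heavy. 1 is still the single most likely value, but it is drawn only rarely, and most samples are very large integers.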
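The sketch below draws the samples from the question and reports the fraction equal to 1 along with the median, which makes it easy to check the claim empirically. The seed is an arbitrary choice.

```python
import numpy as np

np.random.seed(3)   # arbitrary seed
samples = np.random.zipf(1.0001, size=10_000)

frac_ones = np.mean(samples == 1)   # share of samples equal to 1
median = np.median(samples)         # typical magnitude of a sample
print(frac_ones, median)
```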
Which field frequently uses Zipf distributions for analysis?
B
Natural language processing
Analysis & Theory
Zipf distributions are widely used in NLP to analyze word frequencies in large corpora.
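As a rough illustration of a rank-frequency analysis, the sketch below treats Zipf samples as hypothetical word IDs and ranks them by count. The data is synthetic, not a real corpus, and the exponent, seed, and sample size are arbitrary assumptions.

```python
import numpy as np

np.random.seed(4)                               # arbitrary seed
word_ids = np.random.zipf(2.0, size=20_000)     # hypothetical "word" IDs

# Count how often each ID occurs, then sort from most to least frequent.
ids, counts = np.unique(word_ids, return_counts=True)
order = np.argsort(counts)[::-1]

for rank, idx in enumerate(order[:5], start=1):
    print(rank, ids[idx], counts[idx])
```

With real text, the same ranking step would be applied to token counts instead of sampled IDs.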