Extremum estimator
In statistics and econometrics, extremum estimators are a wide class of estimators for parametric models that are calculated through maximization (or minimization) of a certain objective function, which depends on the data. The general theory of extremum estimators was developed by Amemiya (1985).

== Definition ==
An estimator <math>\hat{\theta}</math> is called an extremum estimator if there is an objective function <math>\hat{Q}_n</math> such that

<math>\hat{\theta} = \underset{\theta \in \Theta}{\operatorname{arg\,max}}\ \hat{Q}_n(\theta),</math>
where Θ is the parameter space. Sometimes a slightly weaker definition is given:

<math>\hat{Q}_n(\hat{\theta}) \geq \max_{\theta \in \Theta} \hat{Q}_n(\theta) - o_p(1),</math>
where <math>o_p(1)</math> denotes a random variable that converges in probability to zero. With this modification, <math>\hat{\theta}</math> does not have to be the exact maximizer of the objective function; it only has to come sufficiently close to it.

The theory of extremum estimators does not specify what the objective function should be. Different objective functions suit different models, and this framework allows the theoretical properties of the resulting estimators to be analysed from a unified perspective. The theory only specifies the properties that the objective function has to possess, so when selecting a particular objective function one only has to verify that those properties are satisfied.
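To make the definition concrete, the following is a minimal sketch in Python, not a reference implementation. It assumes an exponential model with rate θ, takes the average log-likelihood as the objective function, and uses a compact parameter space Θ = [0.01, 50]. The helper names `make_objective` and `argmax_golden` are hypothetical and chosen only for this illustration.

```python
import math
import random


def make_objective(sample):
    """Return Q_n(theta): the average log-likelihood of an Exponential(rate=theta) model."""
    n = len(sample)
    total = sum(sample)

    def q_n(theta):
        # log f(x; theta) = log(theta) - theta * x, averaged over the sample
        return math.log(theta) - theta * total / n

    return q_n


def argmax_golden(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the maximizer lies in [a, d]
        else:
            a = c  # the maximizer lies in [c, b]
    return (a + b) / 2.0


rng = random.Random(0)  # fixed seed so the run is reproducible
true_rate = 2.0         # assumed true parameter theta_0 for the simulation
sample = [rng.expovariate(true_rate) for _ in range(5000)]

q_n = make_objective(sample)
theta_hat = argmax_golden(q_n, 0.01, 50.0)  # extremum estimator over Theta = [0.01, 50]
closed_form = len(sample) / sum(sample)     # known maximizer for this model: 1 / sample mean

print(f"numerical argmax: {theta_hat:.6f}, closed form: {closed_form:.6f}")
```

In this particular model the maximizer is available in closed form, so the numerical search is only there to mirror the arg max in the definition. Swapping in a different objective function (for example a negative sum of squared residuals) leaves the rest of the sketch unchanged, provided the objective remains unimodal on the chosen interval.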

== Consistency ==

If the parameter space Θ is compact and there is a limiting function <math>Q_0(\theta)</math> such that <math>\hat{Q}_n(\theta)</math> converges to <math>Q_0(\theta)</math> in probability uniformly over Θ, and <math>Q_0(\theta)</math> is continuous and has a unique maximum at <math>\theta = \theta_0</math>, then <math>\hat{\theta}</math> is consistent for <math>\theta_0</math>.

The uniform convergence in probability of <math>\hat{Q}_n(\theta)</math> means that

<math>\sup_{\theta \in \Theta} \big| \hat{Q}_n(\theta) - Q_0(\theta) \big| \ \xrightarrow{p}\ 0.</math>
The requirement for Θ to be compact can be replaced with the weaker assumption that the maximum of <math>Q_0</math> is well-separated: there should not exist points θ that are far from <math>\theta_0</math> but for which <math>Q_0(\theta)</math> is close to <math>Q_0(\theta_0)</math>. Formally, for any sequence <math>\{\theta_i\}</math> such that <math>Q_0(\theta_i) \to Q_0(\theta_0)</math>, it must hold that <math>\theta_i \to \theta_0</math>.
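The consistency conditions can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not part of the general theory: it assumes an exponential model with true rate θ0 = 2, uses the average log-likelihood as the objective function, and approximates the compact set Θ = [0.1, 10] by a grid. In that model the limiting function is Q0(θ) = log θ − θ/θ0, which is continuous with a unique maximum at θ0. The names `THETA0`, `GRID`, `q0` and `sup_gap_and_estimate` are hypothetical.

```python
import math
import random

THETA0 = 2.0                                  # assumed true rate
GRID = [0.1 + 0.01 * k for k in range(991)]   # grid approximating Theta = [0.1, 10]


def q0(theta):
    """Limiting objective: E[log f(X; theta)] when X ~ Exponential(rate=THETA0)."""
    return math.log(theta) - theta / THETA0


def sup_gap_and_estimate(n, rng):
    """Return (sup over the grid of |Q_n - Q_0|, grid maximizer of Q_n) for one sample of size n."""
    xbar = sum(rng.expovariate(THETA0) for _ in range(n)) / n

    def q_n(theta):
        # sample objective: average exponential log-likelihood
        return math.log(theta) - theta * xbar

    sup_gap = max(abs(q_n(t) - q0(t)) for t in GRID)
    theta_hat = max(GRID, key=q_n)
    return sup_gap, theta_hat


rng = random.Random(1)  # fixed seed so the run is reproducible
results = {n: sup_gap_and_estimate(n, rng) for n in (100, 10_000, 1_000_000)}

for n, (gap, theta_hat) in results.items():
    print(f"n={n:>9}: sup|Q_n - Q_0| = {gap:.5f}, theta_hat = {theta_hat:.2f}")
```

As the sample size grows, the supremum of the gap over the grid should shrink and the grid maximizer should settle near θ0, which is the behaviour that the uniform convergence condition and the uniqueness of the maximum are meant to guarantee.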

== Asymptotic normality ==
Assuming that consistency has been established and that the derivatives of the sample objective function <math>\hat{Q}_n</math> satisfy some additional regularity conditions, the extremum estimator, suitably centered and scaled, converges in distribution to a normal random variable.
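As a hedged sketch of what such conditions typically deliver, the usual textbook statement for smooth objective functions (see, for example, Newey and McFadden, 1994) runs as follows. The assumptions here are not spelled out in the text above and are supplied only for illustration: <math>\theta_0</math> is interior to Θ, <math>\hat{Q}_n</math> is twice continuously differentiable near <math>\theta_0</math>, the scaled gradient satisfies <math>\sqrt{n}\,\nabla_\theta \hat{Q}_n(\theta_0) \xrightarrow{d} N(0, \Sigma)</math>, and the Hessian converges in probability, uniformly near <math>\theta_0</math>, to a continuous limit whose value <math>H</math> at <math>\theta_0</math> is nonsingular. The symbols <math>H</math>, <math>\Sigma</math> and <math>\bar{\theta}</math> are notation introduced for this sketch. A mean-value expansion of the first-order condition gives

<math>0 = \nabla_\theta \hat{Q}_n(\hat{\theta}) = \nabla_\theta \hat{Q}_n(\theta_0) + \nabla_{\theta\theta} \hat{Q}_n(\bar{\theta})\,(\hat{\theta} - \theta_0),</math>

where <math>\bar{\theta}</math> lies between <math>\hat{\theta}</math> and <math>\theta_0</math>, so that

<math>\sqrt{n}\,(\hat{\theta} - \theta_0) = -\big[\nabla_{\theta\theta} \hat{Q}_n(\bar{\theta})\big]^{-1} \sqrt{n}\,\nabla_\theta \hat{Q}_n(\theta_0) \ \xrightarrow{d}\ N\big(0,\ H^{-1} \Sigma H^{-1}\big).</math>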

== Examples ==

== References ==
Newey, Whitney K.; McFadden, Daniel (1994). "Large sample estimation and hypothesis testing". Handbook of Econometrics. IV. Elsevier Science. pp. 2111–2245. doi:10.1016/S1573-4412(05)80005-4. ISBN 0-444-88766-0.