
Journal of Applied Sciences

Year: 2006 | Volume: 6 | Issue: 5 | Page No.: 1085-1089
DOI: 10.3923/jas.2006.1085.1089
A General Projection Gradient Method for Linear Constrained Optimization with Superlinear Convergence
Zhibin Zhu and Binliang Zhang

Abstract: In this study, by using the projection technique and the idea of the conjugate projection, a new algorithm is presented to solve linear equality and inequality constrained optimization problems. Under conditions weaker than those in the corresponding references, the proposed method is proved to be globally convergent as well as superlinearly convergent.

Keywords: superlinear convergence, linear constrained optimization, conjugate projection gradient algorithm, global convergence

INTRODUCTION

We consider the following linear constrained optimization problem:

min f0(x),  s.t.  fj(x) = aTjx - bj ≤ 0, j ∈ I,   fj(x) = aTjx - bj = 0, j ∈ E,   (1)

where f0: Rn → R is the objective function, I and E are the finite index sets of the inequality and equality constraints, aj ∈ Rn and bj ∈ R are given data, X = {x ∈ Rn: fj(x) ≤ 0, j ∈ I; fj(x) = 0, j ∈ E} denotes the feasible set and I(x) = {j ∈ I: fj(x) = 0} denotes the active index set at x ∈ X.

Since the gradient projection method was proposed by Rosen (1960), it has become one of the basic methods for solving nonlinear programming problems, and many authors have studied it (Zhang, 1979; Jian and Zhang, 1999). However, because it lacks second-derivative information, this type of method converges slowly.

In order to speed up the rate of convergence, several improved algorithms have been proposed (Han, 1976; Panier and Tits, 1987; Facchinei and Lucidi, 1995). In two references (Shi, 1996; Zhang and Wang, 1999), by generalizing the gradient projection to the conjugate projection, a new projection variable metric algorithm was presented which combines the penalty function method with the variable metric algorithm. However, even under some strong conditions (for example, that the sequence {xk} converges to the optimum solution x* and the corresponding multiplier vector sequence {uk} converges to the optimum multipliers u*), it is only proved that the sequence {xk, uk} converges superlinearly to (x*, u*), not the sequence {xk} itself.

In this study, by taking advantage of the projection gradient technique, a new general projection gradient method is presented to improve those methods (Shi, 1996; Zhang and Wang, 1999). Under some weaker, suitable conditions, it is proved that the sequence {xk} generated by the algorithm converges superlinearly to the optimum solution x*.

DESCRIPTION OF ALGORITHM

The following assumptions are assumed to hold throughout the study.

H 1: The feasible set X ≠ Ø and the function f0(x) is twice differentiable;

H 2: ∀x ∈ X, the vectors {aj, j ∈ I(x) ∪ E} are linearly independent.

Definition 1: The function μ(x): Rn → Rm is called a multiplier function, if μ(x) is continuous and μ(x*) is the corresponding K-T multiplier vector for the K-T point x* of (1).
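Recall that x* ∈ X is a K-T point of (1) if there exists a multiplier vector u* such that, in the standard form for linear constraints,

\nabla f_0(x^*) + \sum_{j \in I \cup E} u_j^* a_j = 0, \qquad u_j^* \ge 0,\quad u_j^* f_j(x^*) = 0 \quad (j \in I).

A concrete illustration of Definition 1 (one common choice in the literature on multiplier functions, stated here only as an assumption, since the authors' μ(x) is not reproduced in this version) is the regularized least-squares estimate

\mu(x) = -\bigl(A^{T}A + D(x)\bigr)^{-1} A^{T} \nabla f_0(x), \qquad A = [a_j]_{j \in I \cup E},

where D(x) is the diagonal matrix whose entries are fj(x)^2 for j ∈ I and 0 for j ∈ E. This μ(x) is well defined and continuous wherever A^T A + D(x) is nonsingular (in particular near any K-T point satisfying H 2), and at such a K-T point it returns exactly the corresponding K-T multiplier vector.
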

For the current approximate solution xk ∈ X, a scalar σk > 0, a positive definite matrix Bk = B(xk) and a set Lk ⊆ I ∪ E, we define:

(2)

(3)
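
The displays (2) and (3) (and the auxiliary problem (4) below) are not reproduced in this version. As a purely schematic illustration of a Bk-weighted projection step of this general kind, the sketch below computes a direction d and a multiplier estimate u from the linear system that dk0 is later shown to satisfy in Lemma 4 (with Vk = -F(xk)); the function and variable names are illustrative only and are not taken from the paper.

import numpy as np

def projection_direction(grad_f0, A, F, B):
    # Schematic sketch (not the paper's Eq. (2)-(4)): solve
    #   B d + A u = -grad_f0,   A^T d = -F,
    # i.e. the relation satisfied by d_k^0 and the multiplier
    # estimate in Lemma 4 (with V_k = -F(x_k)).
    # grad_f0 : (n,) gradient of f0 at x_k
    # A       : (n, m) matrix whose columns are a_j, j in L_k
    # F       : (m,) constraint values f_j(x_k), j in L_k
    # B       : (n, n) symmetric positive definite matrix B_k
    n, m = A.shape
    K = np.block([[B, A], [A.T, np.zeros((m, m))]])
    rhs = np.concatenate([-grad_f0, -F])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]          # direction d, multiplier estimate u

# Toy usage: min 0.5*||x||^2 subject to x1 + x2 - 1 = 0, at x_k = (0.2, 0.8):
# projection_direction(np.array([0.2, 0.8]), np.array([[1.0], [1.0]]),
#                      np.array([0.0]), np.eye(2)) returns d = (0.3, -0.3).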

Consider the following auxiliary problem:

(4)

Where

Define the directional derivative of Gck along d at x as follows:

It is easy to see that
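
The displayed definitions of Gck and of the directional derivative DGck(x, d) are not reproduced above. For orientation only, one common construction for linearly constrained problems (an assumption, not a quotation from the paper) is the l1 exact penalty

G_c(x) = f_0(x) + c\Big(\sum_{j \in I} \max\{0, f_j(x)\} + \sum_{j \in E} |f_j(x)|\Big),

whose directional derivative along d at x (using ∇fj(x) = aj) is

DG_c(x; d) = \nabla f_0(x)^T d + c\Big(\sum_{j \in I,\ f_j(x) > 0} a_j^T d + \sum_{j \in I,\ f_j(x) = 0} \max\{0,\ a_j^T d\} + \sum_{j \in E,\ f_j(x) \neq 0} \mathrm{sgn}(f_j(x))\, a_j^T d + \sum_{j \in E,\ f_j(x) = 0} |a_j^T d|\Big).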

The following algorithm is proposed for solving problem (1):

Algorithm A

Step 0:


Step 1: Let i = 0, σk,i = σ0

Step 2: If det and go to Step 4; otherwise, go to Step 3, where

Step 3: Let i = i + 1, σk,i = (1/2)σk,i-1, and go to Step 2.

Step 4: Compute

Step 5: Compute dk0. If dk0 = 0, STOP; otherwise, compute dk1. If

(5)

go to Step 6; otherwise, go to Step 7.

Step 6: Let λ = 1.
1) If

(6)

set λk = λ, go to Step 8, otherwise go to 2).
2) Let λ = (1/2)λ. If λ < ε, go to Step 7; otherwise, go to 1) of Step 6.

Step 7: Compute

(8)

Determine βk, the first number β in the sequence {1, 1/2, 1/4, ...} satisfying

(9)

(10)

Set dk = qk, λk = βk.

Step 8: Obtain Bk+1 by updating the positive definite matrix Bk with a quasi-Newton formula (a schematic sketch of one such update is given after the algorithm). Set

Set k = k + 1. Go back to Step 1.
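
The acceptance tests (6) and (9)-(10) used in Steps 6 and 7 refer to displays that are not reproduced above. As a stand-in only, the following sketch shows the general shape of such a halving (backtracking) step-size search, here with a generic Armijo-type decrease test on the penalty function Gc; the parameters alpha and eps are illustrative and are not taken from the paper.

def backtracking_step(G_c, DG_c, x, d, alpha=0.1, eps=1e-8):
    # Generic Armijo backtracking that halves the trial step, in the
    # spirit of Steps 6-7; the actual tests (6), (9)-(10) of Algorithm A
    # are not reproduced here, so a sufficient-decrease test on the
    # penalty G_c is used as a placeholder.
    lam = 1.0
    while G_c(x + lam * d) > G_c(x) + alpha * lam * DG_c(x, d):
        lam *= 0.5
        if lam < eps:      # corresponds to the "lambda < epsilon" exit of Step 6
            return None    # caller switches to the alternative direction q_k of Step 7
    return lam

Step 8 only requires that Bk+1 be obtained from Bk by a quasi-Newton formula that preserves positive definiteness. One standard way to realize this (again an assumption, since the paper does not fix the formula) is Powell's damped BFGS update:

import numpy as np

def damped_bfgs_update(B, s, y, theta_bound=0.2):
    # Powell's damped BFGS update: keeps B_{k+1} symmetric positive
    # definite even when the curvature s^T y is small or negative.
    # s = x_{k+1} - x_k,  y = grad f0(x_{k+1}) - grad f0(x_k).
    Bs = B @ s
    sBs = s @ Bs
    sy = s @ y
    theta = 1.0 if sy >= theta_bound * sBs else (1.0 - theta_bound) * sBs / (sBs - sy)
    r = theta * y + (1.0 - theta) * Bs      # damped secant vector
    return B - np.outer(Bs, Bs) / sBs + np.outer(r, r) / (s @ r)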

CONVERGENCE OF ALGORITHM

First, it is shown that Algorithm A is well defined.

H 3: The sequence {xk} is bounded, and the sequence {Bk} is positive definite.

Lemma 1: For any iteration, there is no infinite cycle between Step 1 and Step 3. Moreover, if then there exists a constant , such that for k ∈ K, k large enough.

Proof: We refer to the proof of Lemma 1 in reference (Zhu et al., 2003).

Theorem 1: ∀k, if dk0 = 0, then xk is a K-T point of (1); otherwise, it holds that

(11)

Proof: Firstly, it is easy to see that

If dk0 = 0, it holds that

0 = ATkdk0 = Vk, Pk∇f0 (xk) = 0

So, from (2), (3) and H 3, we have

which shows that xk is a K-T point of (1).

If dk0 ≠ 0, from the definition of Vk, it is obvious that

So, it holds that

From the definition of ck, it holds that DGck(xk, dk0) < 0. Since ATkdk0 = Vk, it holds that

So, we have

The conclusion holds.

Lemma 2: There exists a constant k0, such that ck ≡ c for all k ≥ k0.

In the sequel, we always assume that ck ≡ c.

Theorem 2: The algorithm either stops at the K-T point xk of the problem (1) in finite iteration, or generates an infinite sequence {xk}, any accumulation point x* of which is a K-T point of the problem (1).

Proof: The first statement is obvious, the only stopping point being Step 5. Thus, suppose that {xk}k∈K → x* and dk0 → d*0, k ∈ K. From (5), (6), (9) and Theorem 1, it is easy to see that {Gc(xk)} is decreasing. So, it holds that

(12)

If there exists K1 ⊆ K (|K1| = ∞) such that, for all k ∈ K1, xk+1 = xk + λkdk is generated by Step 6 and Step 8, then from (5) and (6), we get

So, dk0 → 0, k ∈ K1. Since dk0 → d*0, k ∈ K, it is clear that d*0 = 0, i.e., dk0 → 0, k ∈ K. So, according to Theorem 1, it is obvious that x* is a K-T point of (1).

Now, we may assume that, for all k ∈ K, xk+1 = xk + λkdk is generated by Step 7 and Step 8. Suppose that the desired conclusion is false, i.e., d*0 ≠ 0. Imitating the proof of Theorem 1, we have DGc(x*, q*) < 0, and we can conclude that the step-size βk obtained by the line search in Step 7 is bounded away from zero on K, i.e.,

βk ≥ β* = inf{βk, k ∈ K} > 0, k ∈ K

So, from (9) and Theorem 1, it holds that

This is a contradiction, which shows that dk0 → 0, k ∈ K, k → ∞. So, according to Theorem 1, it is easy to see that x* is a K-T point of (1).

In order to obtain superlinear convergence, we also make the following additional assumptions.

H 4: The sequence generated by the algorithm possesses an accumulation point x*.

H 5: Bk → B*, k → ∞.

H 6: The second-order sufficiency conditions with strict complementary slackness are satisfied at the K-T point x* and the corresponding multiplier vector u*.
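
In the standard form for linearly constrained problems (stated here for reference, not quoted from the paper), H 6 requires that the K-T conditions hold at (x*, u*) with

u_j^* > 0 \quad \text{for every } j \in I(x^*) \qquad \text{(strict complementary slackness)},

and that

d^{T} \nabla^{2} f_0(x^*)\, d > 0 \quad \text{for all } d \neq 0 \text{ with } a_j^{T} d = 0,\ j \in I(x^*) \cup E;

since the constraints are linear, the Hessian of the Lagrangian reduces to ∇2f0(x*).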

According to Lemma 5 in reference (Zhu, 2005), we have the following conclusion.

Lemma 3: The entire sequence {xk} converges to x*, i.e., xk → x*, k → ∞, and for k large enough, it holds that

Lemma 4: Denote ék = πk + (ATkB-1kAk)-1F(xk). Under the above-mentioned conditions, for k large enough, it holds that

∇f0(xk) + Bkdk0 + Akék = 0, F(xk) + ATkdk0 = 0

Proof: According to Lemma 3 and H 6, it holds, for k large enough, that πk>0. So, from the definition of Vk, we have

ATkdk0 = Vk = -F(xk), F(xk) + ATkdk0 = 0

Meanwhile,

The conclusion holds.

Lemma 5: For k large enough, there exists a constant b>0, such that

Proof: Since xk → x* and, for k large enough, Lk ≡ I(x*) ∪ E, it holds that

Thereby, there exists some η>0, such that

while, from Lemma 4 it holds that

In addition, it holds, for k large enough, that πkj > 0 and aTjdk0 = (Vk)j = -fj(xk), j ∈ Lk. So

From (2) and (3), we have ||dk|| ~ ||dk0|| and ||dk1|| = o(||dk0||2).

In order to obtain superlinear convergence, a crucial requirement is that a unit step size be used in a neighborhood of the solution. This can be achieved if the following assumption is satisfied.

H 7: Let

where

In view of Lemma 4, imitating the proof of Lemma 4.4 in reference (Zhu, 2005), it is easy to obtain the following conclusion.

Lemma 6: For k large enough, Step 7 is no longer performed in the algorithm and the attempted search in Step 6 is successful in every iteration, i.e., λk ≡ 1, xk+1 = xk + dk.

Moreover, in view of Lemma 3.8 and along the lines of Theorem 2 in reference (Panier and Tits, 1987), we may obtain the following theorem:

Theorem 3: Under all the above-mentioned assumptions, the algorithm is superlinearly convergent, i.e., the sequence {xk} generated by the algorithm satisfies ||xk+1 - x*|| = o(||xk - x*||).

ACKNOWLEDGMENT

This study was supported in part by the NNSF (10501009, 60471039) of China.

REFERENCES

  • Facchinei, F. and S. Lucidi, 1995. Quadratically and superlinearly convergent algorithms for the solution of inequality constrained minimization problems. J. Optimization Theor. Appl., 85: 265-289.

  • Han, S.P., 1976. Superlinearly convergent variable metric algorithms for general nonlinear programming problems. Math. Program., 11: 263-282.

  • Jian, J.B. and K.C. Zhang, 1999. Subfeasible direction method with strong convergence for inequality constrained optimization. J. Xi'an Jiaotong Univ., 33: 88-92.

  • Panier, E.R. and A.L. Tits, 1987. A superlinearly convergent feasible method for the solution of inequality constrained optimization problems. SIAM J. Control Optim., 25: 934-950.

  • Rosen, J.B., 1960. The gradient projection method for nonlinear programming. Part I: Linear constraints. SIAM J. Applied Math., 8: 181-217.

  • Shi, Z.J., 1996. A class of globally convergent conjugate projection gradient methods and their superlinear convergence rate. Math. Numerica Sin., 4: 411-421.

  • Zhang, X.S., 1979. An improved Rosen-Polak method. Acta Math. Appl. Sin., 2: 257-267.

  • Zhang, J.L. and C. Wang, 1999. A new conjugate projection gradient method. Transactions, 2: 61-70.

  • Zhu, Z.B., K.C. Zhang and J.B. Jian, 2003. An improved SQP algorithm for inequality constrained optimization. Math. Methods Operat. Res., 58: 271-282.

  • Zhu, Z.B., 2005. A globally and superlinearly convergent feasible QP-free method of nonlinear programming. Applied Math. Comput., 168: 519-539.
