Greedy Algorithms: A Summary

Introduction

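A greedy algorithm uses a computer to simulate the decisions of a "greedy" person. This person is extremely greedy: at every step they pick the operation that is best according to some metric. They are also short-sighted: they look only at the present and never consider the consequences later on.

As you might expect, a greedy method does not always produce an optimal solution, so before using one you should generally make sure you can prove it is correct.

This article summarizes what I learned from solving a number of greedy problems.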

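Common scenarios

The most common greedy algorithms come in two forms.

  • "Sort XXX by some key, then select items in some order (for example, from smallest to largest)."
  • "Repeatedly take the largest/smallest item in XXX and update XXX." (Finding "the largest/smallest item in XXX" can sometimes be optimized, for example by maintaining a priority queue.)

The first form is offline: process first, then select. The second is online: select while processing.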
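
As a rough illustration of the second (online) form, here is a minimal sketch on a task the article itself does not discuss: repeatedly merging the two smallest values and summing the merge costs. The function name `mergeCost` and the task are assumptions chosen only to show the "take the smallest, then update" loop with a priority queue.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Illustrative only: total cost of repeatedly merging the two smallest values.
// Each merge costs the sum of the two values, and the sum goes back into the pool.
long long mergeCost(const std::vector<long long>& items) {
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<long long, std::vector<long long>, std::greater<long long>>
        heap(items.begin(), items.end());
    long long total = 0;
    while (heap.size() > 1) {
        long long a = heap.top(); heap.pop();  // smallest
        long long b = heap.top(); heap.pop();  // second smallest
        total += a + b;
        heap.push(a + b);                      // update the pool
    }
    return total;
}
```

For the values 8, 5, 8 the loop merges 5 + 8 = 13 and then 8 + 13 = 21, for a total of 34.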

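Typical problem settings are:

  • Finding an optimal combination (coin problems)
  • Interval problems (arranging intervals sensibly)
  • Lexicographic-order problems
  • Max/min problems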

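\(\mathcal{A}\) Optimal combinations

The coin problem is a classic greedy exercise. I think optimal-combination problems fall into two main types:

  • Simple -- sort, then select according to some strategy
  • Complex -- besides following the greedy strategy, some extra processing or simulation is needed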

The coin problem

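There are \(C_1, C_5, C_{10}, C_{50}, C_{100}, C_{500}\) coins worth 1, 5, 10, 50, 100, and 500 yuan respectively. What is the minimum number of coins needed to pay exactly \(A\) yuan? Assume at least one way of paying exists.

  • \(0 \leq C_1, C_5, C_{10}, C_{50}, C_{100}, C_{500} \leq 10^9\)
  • \(0 \leq A \leq 10^9\)

This is a problem of the simple type described above: to use as few coins as possible, prefer the largest denominations.

The solution is then straightforward: scan the denominations once from largest to smallest (they are assumed to be sorted by value already).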

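#include <algorithm>
#include <iostream>
using namespace std;

const int V[6] = {1, 5, 10, 50, 100, 500}; // denominations, ascending
int A, C[6]; // input: amount to pay and the count of each coin

void solve(){
    int ans = 0;

    for (int i = 5; i >= 0; -- i){ // from the largest denomination down
        int t = min(A / V[i], C[i]); // use as many of coin i as possible
        A -= t * V[i];
        ans += t;
    }

    cout << ans << '\n';
}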
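
To make the loop easy to try on concrete numbers, here is a minimal sketch of the same greedy as a pure function. The name `minCoins`, the `-1` return for "cannot pay exactly", and the sample values are assumptions for illustration, not part of the original solution.

```cpp
#include <algorithm>
#include <vector>

// Sketch: fewest coins needed to pay `amount` exactly.
// count[i] is the number of coins of value {1, 5, 10, 50, 100, 500}[i];
// the vector is assumed to hold exactly six entries.
long long minCoins(long long amount, const std::vector<long long>& count) {
    static const long long value[6] = {1, 5, 10, 50, 100, 500};
    long long used = 0;
    for (int i = 5; i >= 0; --i) {                 // largest denomination first
        long long take = std::min(amount / value[i], count[i]);
        amount -= take * value[i];
        used += take;
    }
    return amount == 0 ? used : -1;                // -1: no exact payment found
}
```

With 3, 2, 1, 3, 0, 2 coins of each value and an amount of 620, the greedy takes one 500, two 50s, one 10, and two 5s, for 6 coins in total.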

The allowance problem

POJ3040 Allowance

Description

As a reward for record milk production, Farmer John has decided to start paying Bessie the cow a small weekly allowance. FJ has a set of coins in \(N\) (1 <= N <= 20) different denominations, where each denomination of coin evenly divides the next-larger denomination (e.g., 1 cent coins, 5 cent coins, 10 cent coins, and 50 cent coins). Using the given set of coins, he would like to pay Bessie at least some given amount of money \(C\) (1 <= C <= 100,000,000) every week. Please help him compute the maximum number of weeks he can pay Bessie.

Input

* Line 1: Two space-separated integers: \(N\) and \(C\)

* Lines 2..N+1: Each line corresponds to a denomination of coin and contains two integers: the value \(V\) (1 <= V <= 100,000,000) of the denomination, and the number of coins \(B\) (1 <= B <= 1,000,000) of this denomination in Farmer John's possession.

Output

* Line 1: A single integer that is the number of weeks Farmer John can pay Bessie at least C allowance

Sample Input

3  6
10 1
1  100
5  120

Sample Output

111

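In short: Farmer John must pay Bessie an allowance of at least \(C\) every week. He has coins in \(N\) denominations, and each denomination evenly divides every larger one. Given the value \(V_i\) and count \(B_i\) of each denomination, find the maximum number of weeks he can pay the allowance.

This is more complex than the previous problem, because of two difficulties:

  • The strategy changes from simply taking the largest coin to paying for as many weeks as possible, so the goal has to be restated
  • For each payment plan found, the remaining count of each coin also has to be tracked.

First, we know this problem can be solved greedily; what remains is to make the goal concrete. Paying for more weeks really means wasting as little money as possible, so each week we should pick the plan that currently wastes the least. (Any coin worth at least the target pays for one week by itself.)

Concretely, let the threshold value be \(V_{cur}\) and the target be \(V_{tar}\), so that \(V_{cur} < V_{tar}\) and \(V_{cur} + V_i \ge V_{tar}\). The \(V_i\) chosen here must be the smallest one available.

So we can:

  1. Scan from largest to smallest to build a threshold value that does not exceed \(V_{tar}\).
  2. Then scan from smallest to largest for the first usable \(V_i\)
  3. Count how many weeks this plan can be repeated, then update the answer and the coin counts.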
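
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

typedef long long LL;
typedef pair<LL, LL> PLL;
#define MP make_pair

int n, c; // input, read by the caller: number of denominations and weekly allowance
vector<PLL> coins; // {value, count}

void solve(){
    int res = 0;
    for (int i = 0; i < n; ++ i){
        LL val, num;
        cin >> val >> num;
        if (val >= c) res += num; // a coin worth at least c pays one week by itself
        else coins.push_back(MP(val, num));
    }

    sort(coins.begin(), coins.end()); // ascending by value
    int sz = coins.size();
    while (true){
        LL tmp = c;
        vector<int> use(sz, 0);
        for (int i = sz - 1; i >= 0; -- i){ // large to small: get as close to c as possible without exceeding it
            use[i] = min(tmp / coins[i].first, coins[i].second); // how many of coin i can be used at most
            tmp -= coins[i].first * use[i]; // update
        }

        if (tmp){ // the amount was not matched exactly
            for (int i = 0; i < sz; ++ i){ // small to large: add exactly one coin (if two fit, the pass above would already have taken one)
                if (coins[i].second >= use[i] + 1){ // a plain if (coins[i].second) could spend a coin that is already fully used
                    ++ use[i];
                    tmp -= coins[i].first;
                    break;
                }
            }
        }
        if (tmp > 0) break; // c can no longer be reached

        LL mxu = 0x3f3f3f3f;
        for (int i = 0; i < sz; ++ i) { // how many weeks this plan can be repeated
            if (use[i] == 0) continue;
            mxu = min(mxu, coins[i].second / use[i]);
        }
        res += mxu;
        for (int i = 0; i < sz; ++ i){ // update the remaining counts
            coins[i].second -= (mxu * use[i]);
        }
    }

    cout << res << '\n';
}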
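
To check the three steps against the sample input, here is a minimal sketch of the same greedy written as a pure function. The name `maxWeeks` and its signature are assumptions for illustration; it assumes \(C \ge 1\) and the divisibility property from the problem statement.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Sketch: maximum number of weeks an allowance of at least c can be paid.
// coins holds {value, count} pairs; each value is assumed to divide every larger one.
long long maxWeeks(long long c, std::vector<std::pair<long long, long long>> coins) {
    long long weeks = 0;
    std::vector<std::pair<long long, long long>> small;
    for (const auto& coin : coins) {
        if (coin.first >= c) weeks += coin.second;  // one such coin pays a whole week
        else small.push_back(coin);
    }
    std::sort(small.begin(), small.end());          // ascending by value
    const int sz = static_cast<int>(small.size());
    while (true) {
        long long rest = c;
        std::vector<long long> use(sz, 0);
        for (int i = sz - 1; i >= 0; --i) {         // step 1: large to small, never exceeding c
            use[i] = std::min(rest / small[i].first, small[i].second);
            rest -= use[i] * small[i].first;
        }
        if (rest > 0) {
            for (int i = 0; i < sz; ++i) {          // step 2: smallest spare coin closes the gap
                if (small[i].second > use[i]) {
                    ++use[i];
                    rest -= small[i].first;
                    break;
                }
            }
        }
        if (rest > 0) break;                        // c can no longer be reached
        long long times = -1;                       // step 3: how often the plan repeats
        for (int i = 0; i < sz; ++i) {
            if (use[i] == 0) continue;
            long long t = small[i].second / use[i];
            if (times < 0 || t < times) times = t;
        }
        weeks += times;
        for (int i = 0; i < sz; ++i) small[i].second -= times * use[i];
    }
    return weeks;
}
```

On the sample (C = 6 with one 10, one hundred 1s, and 120 5s) the 10 pays one week, the plan 5 + 1 repeats 100 times, and the plan 5 + 5 repeats 10 times, giving 111.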

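To summarize, greedy problems of the optimal-combination kind are mostly offline. We first find the main greedy strategy and then handle the rest with a small simulation. These problems are relatively easy; just watch for the details that tend to go wrong during the simulation.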

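\(\mathcal{B}\) Interval problems

Greedy algorithms are widely used on interval problems (interval scheduling, interval arrangement, interval combination, and so on). Usually the intervals are sorted by some criterion (such as the right endpoint) or maintained with a data structure (such as priority_queue).

I roughly divide greedy interval problems into two kinds:

  • One-dimensional: working hours, itineraries, building placement, and so on
  • Two-dimensional: placement problems involving circles, radii, and so on

Each kind is analyzed below through a classic example:

One-dimensional intervals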

POJ3190 Stall Reservations

Description

Oh those picky \(N\) (1 <= N <= 50,000) cows! They are so picky that each one will only be milked over some precise time interval \(A\cdots B\) (1 <= A <= B <= 1,000,000), which includes both times \(A\) and \(B\). Obviously, FJ must create a reservation system to determine which stall each cow can be assigned for her milking time. Of course, no cow will share such a private moment with other cows.

Help FJ by determining:

  • The minimum number of stalls required in the barn so that each cow can have her private milking period
  • An assignment of cows to these stalls over time

Many answers are correct for each test dataset; a program will grade your answer.

Input

Line 1: A single integer, \(N\)

Lines 2..N+1: Line i+1 describes cow i's milking interval with two space-separated integers.

Output

Line 1: The minimum number of stalls the barn must have.

Lines 2..N+1: Line i+1 describes the stall to which cow i will be assigned for her milking period.

Sample Input

5
1 10
2 4
3 6
5 8
4 7

Sample Output

4
1
2
3
2
4

Hint

Explanation of the sample:

Here's a graphical schedule for this output:

Time     1  2  3  4  5  6  7  8  9 10
Stall 1 c1>>>>>>>>>>>>>>>>>>>>>>>>>>>
Stall 2 .. c2>>>>>> c4>>>>>>>>> .. ..
Stall 3 .. .. c3>>>>>>>>> .. .. .. ..
Stall 4 .. .. .. c5>>>>>>>>> .. .. ..

Other outputs using the same number of stalls are possible.

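In short: each cow has her own milking interval \([A, B]\) and refuses to share a stall with another cow at the same time, so you must use as few stalls as possible while letting every cow be milked on schedule.

This is clearly an interval-arrangement problem. It resembles task scheduling in operating systems (\(\mathcal{OS}\)): each CPU can run only one task at a time, and we want a schedule that uses the fewest CPUs.

The first question is: what is the minimum number of stalls?

This is easy: the minimum number of stalls \(k\) equals the maximum number of cows that need milking at the same moment. On one hand, with fewer than \(k\) stalls at least one cow cannot be placed. On the other hand, whatever the intervals look like, if at most \(k\) cows ever need milking simultaneously, then \(k\) stalls are always enough.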
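
The maximum overlap can be computed with a difference array, which is the technique the article's solution also uses. A minimal sketch follows; the function name `maxOverlap` and the `maxTime` parameter are assumptions for illustration, and every interval is assumed to satisfy 1 <= l <= r <= maxTime.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Sketch: largest number of closed intervals [l, r] that cover one time point.
int maxOverlap(const std::vector<std::pair<int, int>>& intervals, int maxTime) {
    std::vector<int> diff(maxTime + 2, 0);
    for (const auto& iv : intervals) {
        ++diff[iv.first];        // one more cow from time l on
        --diff[iv.second + 1];   // one fewer cow after time r
    }
    int best = 0, running = 0;
    for (int t = 0; t <= maxTime; ++t) {
        running += diff[t];      // prefix sum = cows being milked at time t
        best = std::max(best, running);
    }
    return best;
}
```

For the sample intervals the running count peaks at 4 (for instance at time 4, when cows 1, 2, 3, and 5 are all being milked), matching the sample answer.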

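The second question is: how do we assign the stalls?

Thinking about how process scheduling is implemented gives the answer. Sort all intervals by left endpoint in ascending order, which simulates a process queue. Then place the cows one by one: if some stall in use has already been vacated, reuse it; otherwise open a new stall. (A difference array can be used to find how many stalls are needed.)

How do we find a stall that has been vacated? A priority queue makes this easy. Each time a cow is assigned, push the right endpoint of her interval (her finish time) into a min-heap. When scheduling the next cow, check whether the value at the top of the heap is less than the left endpoint of the current interval, that is, whether the earliest finish time is earlier than the current start time.

The overall approach is therefore:

  1. Decide how to be greedy: handle cows with earlier start times first, and when placing a cow check whether some earlier cow has already finished
  2. Ask whether an existing data structure handles the hard part: a priority queue
  3. Write the solution

The code is as follows: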

#include <algorithm>
#include <cstring>
#include <iostream>
#include <queue>
#include <vector>
using namespace std;

typedef pair<int, int> PII;
#define MP make_pair

const int MAXM = 1e5 + 50;
const int MAXN = 1e6 + 50;
int n, diff[MAXN], res[MAXM]; // cow number (read by the caller) / diff array / answer array

struct run
{
    int flag; // which stall is used
    pair<PII, int> node; // {{start, end}, name}
    run(): flag(-1), node({{-1, -1}, -1}){}
    run(int _flag, pair<PII, int> _node): flag(_flag), node(_node) {}
};

bool cmp1(const pair<PII, int> &a, const pair<PII, int> &b){
    return a.first < b.first;
}

// Reversed comparison so that priority_queue behaves as a min-heap on end time
bool operator< (const run &a, const run &b){
    return a.node.first.second > b.node.first.second;
}

void solve(){
    memset(diff, 0, sizeof(diff));
    memset(res, 0, sizeof(res));

    int l, r;
    vector< pair<PII, int> > arr; // { {left, right}, name }
    for (int i = 0; i < n; ++ i){
        cin >> l >> r;
        arr.push_back(MP(MP(l, r), i));
        ++ diff[l];
        -- diff[r + 1];
    }

    // number of stalls = maximum overlap, via prefix sums of the difference array
    int k = 0, t = 0;
    for (int i = 0; i < MAXN; ++ i){
        t += diff[i];
        k = max(k, t);
    }

    sort(arr.begin(), arr.end(), cmp1); // smaller left endpoint first
    priority_queue<run, vector<run>> que;  // smallest right endpoint on top

    int cnt = 1; // index of the next new stall
    for (int i = 0; i < arr.size(); ++ i){
        // no stall is in use yet, or every stall in use is still occupied at this start time
        if (que.empty() || que.top().node.first.second >= arr[i].first.first){
            res[arr[i].second] = cnt;
            que.push({cnt, arr[i]});
            ++ cnt;
        }else { // the earliest-finishing stall is free: reuse it
            run qt = que.top(); que.pop();
            res[arr[i].second] = qt.flag;
            que.push({qt.flag, arr[i]});
        }
    }

    cout << k << endl;
    for (int i = 0; i < n; ++ i){
        cout << res[i] << endl;
    }
}

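Looking at the code, the value \(k\) computed from the difference array is only printed and is never used in the assignment that follows. In fact the difference-array pass can be dropped and the number of stalls taken directly from the greedy (the number of stalls opened, which is cnt - 1 when the loop ends). It is kept here because it illustrates the difference-array idea, which may be useful later.

Because it processes and updates as it goes, this is an online algorithm.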
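
The observation that the stall count falls out of the greedy itself can be sketched in a few lines. This is a simplified variant that only counts stalls and does not record the assignment; the function name `countStalls` is an assumption for illustration.

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Sketch: minimum number of stalls for closed intervals [start, end],
// computed by the greedy alone (no difference array).
int countStalls(std::vector<std::pair<int, int>> intervals) {
    std::sort(intervals.begin(), intervals.end());   // earlier start first
    // Min-heap of finish times, one entry per stall opened so far.
    std::priority_queue<int, std::vector<int>, std::greater<int>> ends;
    for (const auto& iv : intervals) {
        // The earliest-finishing stall is free only if it ended strictly before
        // this start, because the intervals include both endpoints.
        if (!ends.empty() && ends.top() < iv.first) ends.pop();
        ends.push(iv.second);
    }
    return static_cast<int>(ends.size());            // stalls opened in total
}
```

Each cow either replaces one heap entry (reuse) or adds one (new stall), so the final heap size is the number of stalls opened.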

Two-dimensional intervals

POJ1328 Radar Installation

Description

Assume the coasting is an infinite straight line. Land is in one side of coasting, sea in the other. Each small island is a point locating in the sea side. And any radar installation, locating on the coasting, can only cover \(d\) distance, so an island in the sea can be covered by a radar installation, if the distance between them is at most d.

We use Cartesian coordinate system, defining the coasting is the x-axis. The sea side is above x-axis, and the land side below. Given the position of each island in the sea, and given the distance of the coverage of the radar installation, your task is to write a program to find the minimal number of radar installations to cover all the islands. Note that the position of an island is represented by its x-y coordinates.

Figure A Sample Input of Radar Installations

Input

The input consists of several test cases. The first line of each case contains two integers n (1<=n<=1000) and d, where n is the number of islands in the sea and d is the distance of coverage of the radar installation. This is followed by n lines each containing two integers representing the coordinate of the position of each island. Then a blank line follows to separate the cases.

The input is terminated by a line containing pair of zeros

Output

For each test case output one line consisting of the test case number followed by the minimal number of radar installations needed. "-1" installation means no solution for that case.

Sample Input

3 2
1 2
-3 1
2 1

1 2
0 2

0 0

Sample Output

Case 1: 2
Case 2: 1

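In short: radars are installed on the coastline (the x-axis), each with coverage radius \(d\), and all islands must be covered using as few radars as possible. If that is impossible, output \(-1\).

At first glance this is a two-dimensional problem (something like computational geometry). But since it asks for an optimum, it is natural to try a greedy approach. There are in fact several greedy ideas, and it is easy to pick a wrong one.

We first describe a wrong greedy idea and give a counterexample: sort the islands, start from the leftmost one, and for each island compute the radar position with the largest \(x\) that still covers it; skip the islands covered by that radar and continue from the next uncovered island.

The mistake in this idea is treating a circle as if it were a rectangle. What does that mean?

Let the current point be \(A = (x_a, y_a)\) and the radar position be \((x_i, 0)\), with \((x_i - x_a)^2 + y_a^2 = d^2\). The idea assumes that within the feasible radar interval \([2x_a - x_i, x_i]\), the right end always covers the most islands. But the coverage area is a circle. Take an extreme case: if some point \(B\) is at \((x_b, d)\), the radar must be placed directly below it, at \((x_b, 0)\).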

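eg:
 2 5
-5 3
-3 5
Wrong idea: the rightmost radar position for (-5, 3) is -1, but (-1, 0) cannot cover (-3, 5), so the answer comes out as 2.
In fact a radar at (-3, 0) covers both islands, so the correct answer is 1.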
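
The counterexample can be checked numerically with a small helper. This is a sketch for integer coordinates only; the function name `covers` is an assumption for illustration.

```cpp
// Sketch: true if a radar at (radarX, 0) with range d reaches the island at (x, y).
// Compares squared distances, so it is exact for integer inputs.
bool covers(long long radarX, long long x, long long y, long long d) {
    long long dx = radarX - x;
    return dx * dx + y * y <= d * d;
}
```

With d = 5, a radar at (-1, 0) reaches (-5, 3) (squared distance 16 + 9 = 25) but not (-3, 5) (squared distance 4 + 25 = 29), while a radar at (-3, 0) reaches both (squared distances 13 and 25).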

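In other words, the idea above ignores the two-dimensional nature of the problem and naively treats it as a one-dimensional interval greedy.

Changing perspective makes the problem much simpler. For each island, find the range where a radar covering it can be installed (an interval). The problem then becomes: given several one-dimensional intervals, choose points so that every interval contains at least one chosen point and the number of points is minimal.

This one-dimensional problem is much easier: sort and select greedily. (Sorting by either the left or the right endpoint works.)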
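
As one possible sketch of this reduction, the function below converts each island to its feasible radar interval and then uses the right-endpoint variant of the greedy (the article's own code sorts by left endpoint instead). The name `minRadars` and the `-1` return convention for a single test case are assumptions for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

// Sketch: minimum number of radars on the x-axis covering all islands, or -1.
// islands holds {x, y} pairs; d is the coverage radius.
int minRadars(const std::vector<std::pair<int, int>>& islands, int d) {
    std::vector<std::pair<double, double>> seg;      // stored as {right, left}
    for (const auto& p : islands) {
        if (p.second < 0 || p.second > d) return -1; // island cannot be reached
        double half = std::sqrt(static_cast<double>(d) * d -
                                static_cast<double>(p.second) * p.second);
        seg.push_back({p.first + half, p.first - half});
    }
    std::sort(seg.begin(), seg.end());               // ascending by right endpoint
    int count = 0;
    double last = 0.0;                               // position of the latest radar
    for (const auto& s : seg) {
        if (count == 0 || s.second > last) {         // interval not yet stabbed
            last = s.first;                          // place radar at its right end
            ++count;
        }
    }
    return count;
}
```

Placing each radar at the right end of the first uncovered interval keeps it as far right as that interval allows, so it also covers as many later intervals as possible.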

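Approach:

  1. Reduce the dimension: find the feasible radar interval for each island, turning the problem into one-dimensional intervals
  2. Sort by right endpoint and greedily pick right endpoints, skipping intervals that are already covered (sorting by left endpoint also works; the code below sorts by left endpoint)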
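
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <iostream>
#include <vector>
using namespace std;

typedef double db;
typedef pair<db, db> PDD;
#define MP make_pair

struct Point{
    int x, y;
    Point(): x(0), y(0) {}
    Point(int _x, int _y): x(_x), y(_y) {}
};

vector<PDD> segment;

// Reduce to a one-dimensional interval: where on the x-axis a radar can cover this island
PDD get_segment(Point island, int d){
    db dlt = sqrt((db)(d * d) - (db)(island.y * island.y));
    return MP(island.x - dlt, island.x + dlt);
}

// Interval greedy after reducing 2D to 1D
// The endpoints are floating-point numbers, so keep the types consistent!
int n, d;
void solve(){
    int cnt = 1; // test case number
    while (cin >> n >> d){ // loop over test cases until the "0 0" line
        if (n == 0 && d == 0) break;
        segment.clear();

        int cx, cy, flag = 0;
        for (int i = 0; i < n; ++ i){
            cin >> cx >> cy;
            if (cy > d || cy < 0) { flag = 1; continue; } // the input is not finished yet, so do not break
            PDD seg = get_segment(Point(cx, cy), d);
            segment.push_back(seg);
        }

        /* some island cannot be reached */
        if (flag){
            printf("Case %d: -1\n", cnt++);
            continue;
        }else{
            // sort by left endpoint
            sort(segment.begin(), segment.end());

            int res = 0;
            db end; // the current right boundary is a floating-point number!
            for (int i = 0; i < segment.size(); ++ i){
                if (i == 0) { end = segment[i].second; ++ res; continue; }

                if (segment[i].second < end){
                    end = segment[i].second; // this interval ends earlier, so tighten the boundary
                }else {
                    if (segment[i].first <= end) continue; // already covered by the current radar
                    else {
                        end = segment[i].second;
                        ++ res;
                    }
                }
            }
            printf("Case %d: %d\n", cnt++, res);
        }
    }
}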

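Summary:

For one-dimensional intervals, first try sorting by left or right endpoint, or by interval length, and design a sound greedy strategy. Never trust a strategy blindly.

For two-dimensional or higher-dimensional problems, first try to reduce the dimension and turn them into the familiar one-dimensional case.

\(\mathcal{C}\) Miscellaneous pitfalls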

POJ3262 Protecting the Flowers

Description

Farmer John went to cut some wood and left \(N\) (2 ≤ \(N\) ≤ 100,000) cows eating the grass, as usual. When he returned, he found to his horror that the cluster of cows was in his garden eating his beautiful flowers. Wanting to minimize the subsequent damage, FJ decided to take immediate action and transport each cow back to its own barn.

Each cow i is at a location that is \(T_i\) minutes (1 ≤ \(T_i\) ≤ 2,000,000) away from its own barn. Furthermore, while waiting for transport, she destroys \(D_i\) (1 ≤ \(D_i\) ≤ 100) flowers per minute. No matter how hard he tries, FJ can only transport one cow at a time back to her barn. Moving cow i to its barn requires \(2 × T_i\) minutes (\(T_i\) to get there and \(T_i\) to return). FJ starts at the flower patch, transports the cow to its barn, and then walks back to the flowers, taking no extra time to get to the next cow that needs transport.

Write a program to determine the order in which FJ should pick up the cows so that the total number of flowers destroyed is minimized.

Input

Line 1: A single integer N
Lines 2..N+1: Each line contains two space-separated integers, Ti and Di, that describe a single cow's characteristics

Output

Line 1: A single integer that is the minimum number of destroyed flowers

Sample Input

6
3 1
2 5
2 3
3 2
4 1
1 6

Sample Output

86

Hint

FJ returns the cows in the following order: 6, 2, 3, 4, 1, 5. While he is transporting cow 6 to the barn, the others destroy 24 flowers; next he will take cow 2, losing 28 more of his beautiful flora. For the cows 3, 4, 1 he loses 16, 12, and 6 flowers respectively. When he picks cow 5 there are no more cows damaging the flowers, so the loss for that cow is zero. The total flowers lost this way is 24 + 28 + 16 + 12 + 6 = 86.

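In short: cow \(i\) is \(T_i\) minutes from her barn and eats \(D_i\) flowers per minute; taking her home costs \(2 \times T_i\) minutes (round trip). In what order should the cows be taken so that the loss is minimal?

There are two key quantities, \(T_i\) and \(D_i\). For a problem with several parameters that looks greedy, we compare two individuals to decide which is better, sort on that comparison, and then select.

Consider two cows \(A\) and \(B\), each with two attributes: \(t\) (time) and \(d\) (damage). If cow \(A\) is taken first, the loss is \(2 * B.d * A.t\); if cow \(B\) is taken first, the loss is \(2 * A.d * B.t\). We want the smaller loss, so it is enough to compare \(\frac{A.t}{A.d}\) with \(\frac{B.t}{B.d}\): the cow with the smaller ratio should go first (call this ratio the efficiency).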
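
The comparison can be written out as a short exchange argument. The sketch below assumes \(A\) and \(B\) are adjacent in the order; swapping them does not change the damage done by any other cow, because cows taken earlier are already gone and cows taken later wait \(2(A.t + B.t)\) minutes either way.

\[
\begin{aligned}
\text{loss}(A \text{ before } B) &= 2 \cdot A.t \cdot B.d, \\
\text{loss}(B \text{ before } A) &= 2 \cdot B.t \cdot A.d, \\
2 \cdot A.t \cdot B.d < 2 \cdot B.t \cdot A.d
  &\iff A.t \cdot B.d < B.t \cdot A.d
  \iff \frac{A.t}{A.d} < \frac{B.t}{B.d}.
\end{aligned}
\]

The last step divides by \(A.d \cdot B.d\), which is positive since every \(D_i \ge 1\). The cross-multiplied form in the middle is the one to use in code, since it avoids floating-point division.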

So we define a custom comparison function, sort, and take the cows in ascending order.

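#include <algorithm>
#include <cstdio>
#include <cstring>
using namespace std;

#define LL long long
const int maxn = 1e6 + 50;
struct cow{
    LL t, d;
    // bool operator< (const cow &b) const {
    //     return this->t * b.d < this->d * b.t;
    // } // equivalent member form, left commented out
}C[maxn];
int sum[maxn]; // prefix sums of d in sorted order

// std::sort needs a strict weak ordering, so compare with < rather than <=
bool cmp(const cow &a, const cow &b){
    return a.t * b.d < a.d * b.t;
}

int n; // input, read by the caller
void solve(){
    memset(sum, 0, sizeof(sum));
    for (int i = 0; i < n; ++ i){
        scanf("%lld %lld", &C[i].t, &C[i].d);
    }

    sort(C, C + n, cmp);
    for (int i = 0; i < n; ++ i){
        if (i == 0) sum[i] = C[i].d;
        else sum[i] = sum[i - 1] + C[i].d;
    }

    LL res = 0;
    for (int i = 0; i < n; ++ i){
        // while cow i is being taken home, all cows after it keep eating
        res += (2 * C[i].t * (sum[n - 1] - sum[i]));
    }
    printf("%lld\n", res); // res is a long long, so print it with %lld
}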
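
To check the ordering rule against the sample, here is a minimal sketch of the same greedy as a pure function. It keeps a running total of the remaining damage rate instead of a prefix-sum array; the name `minDamage` and the `Cow` typedef are assumptions for illustration.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

typedef std::pair<long long, long long> Cow;   // {t, d}

// Sketch: minimum number of flowers destroyed when cows are taken home
// in ascending order of t / d.
long long minDamage(std::vector<Cow> cows) {
    // Cross-multiplied, strict comparison: a goes first when a.t / a.d < b.t / b.d.
    std::sort(cows.begin(), cows.end(), [](const Cow& a, const Cow& b) {
        return a.first * b.second < b.first * a.second;
    });
    long long remaining = 0;                   // damage per minute of cows still waiting
    for (const Cow& c : cows) remaining += c.second;
    long long total = 0;
    for (const Cow& c : cows) {
        remaining -= c.second;                 // this cow stops eating once picked up
        total += 2 * c.first * remaining;      // the rest eat during the round trip
    }
    return total;
}
```

On the sample the sorted order is cows 6, 2, 3, 4, 1, 5, and the per-trip losses are 24, 28, 16, 12, 6, and 0, which sum to 86.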

Summary: when a problem has several parameters, compare two individuals against each other to establish the sort priority.