Codeforces 24D: Expected-Value DP Solution Report

D. Broken robot

You received as a gift a very clever robot walking on a rectangular board. Unfortunately, you understood that it is broken and behaves rather strangely (randomly). The board consists of N rows and M columns of cells. The robot is initially at some cell on the i-th row and the j-th column. Then at every step the robot could go to another cell. The aim is to go to the bottommost (N-th) row. The robot can stay at its current cell, move to the left, move to the right, or move to the cell below the current one. If the robot is in the leftmost column it cannot move to the left, and if it is in the rightmost column it cannot move to the right. At every step all possible moves are equally probable. Return the expected number of steps to reach the bottommost row.

Input

On the first line you will be given two space separated integers N and M (1 ≤ N, M ≤ 1000). On the second line you will be given another two space separated integers i and j (1 ≤ i ≤ N, 1 ≤ j ≤ M) — the number of the initial row and the number of the initial column. Note that, (1, 1) is the upper left corner of the board and (N, M) is the bottom right corner.

Output

Output the expected number of steps on a line of itself with at least 4 digits after the decimal point.
Examples
input
10 10
10 4
output
0.0000000000
input
10 14
5 14
output
18.0038068653

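Let f[i][j] be the expected number of steps needed to reach the bottom row from cell (i, j), with f[n][j]=0. Every allowed move is equally likely and costs one step, so for m≥2:

f[i][1]=(f[i][1]+f[i][2]+f[i+1][1])/3+1;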

f[i][j]=(f[i][j-1]+f[i][j]+f[i][j+1]+f[i+1][j])/4+1; j∈[2,m-1]

f[i][m]=(f[i][m]+f[i][m-1]+f[i+1][m])/3+1;

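Running plain Gaussian elimination on these equations is clearly too slow: each row is a system of m unknowns, a generic elimination costs O(m^3) per row, and there can be up to 1000 rows with m up to 1000.

Still, we first eliminate the f[i][j] term that appears on both sides of each equation, which gives: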

f[i][1]=(3+f[i][2]+f[i+1][1])/2;

f[i][j]=(4+f[i][j+1]+f[i][j-1]+f[i+1][j])/3;

f[i][m]=(3+f[i][m-1]+f[i+1][m])/2;
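
To make the structure of these three equations easier to see, here is the same system with all unknowns of row i moved to the left-hand side. This is only a restatement, writing f_{i,j} for f[i][j] and treating row i+1 as known; each equation touches at most three neighbouring unknowns, so the system is tridiagonal and can be solved with a single linear sweep.

```latex
\begin{aligned}
2f_{i,1} - f_{i,2} &= 3 + f_{i+1,1},\\
-f_{i,j-1} + 3f_{i,j} - f_{i,j+1} &= 4 + f_{i+1,j}, \qquad 2 \le j \le m-1,\\
-f_{i,m-1} + 2f_{i,m} &= 3 + f_{i+1,m}.
\end{aligned}
```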

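Assume row f[i+1] is already known, and rearrange the equations slightly.

Take f[i][1] as the example: f[i][1] = (3+f[i][2]+f[i+1][1])/2 = (3+f[i+1][1])/2 + 1/2*f[i][2].

Let A = (3+f[i+1][1])/2 and B = 1/2, so that f[i][1] = A + B*f[i][2].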

Then f[i][2] = (4+f[i][3]+f[i][1]+f[i+1][2])/3 = (4+f[i][3]+A+B*f[i][2]+f[i+1][2])/3.

Collecting the f[i][2] terms gives f[i][2] = (4+f[i][3]+A+f[i+1][2])/(3-B).

Just as with f[i][1], rewrite f[i][2] in the form A'+B'*f[i][3]: f[i][2] = (4+A+f[i+1][2])/(3-B) + 1/(3-B)*f[i][3], that is, A' = (4+A+C)/(3-B) and B' = 1/(3-B), where C is f[i+1][2]. The derivation for f[i][3..m-1] is identical to the one for f[i][2].

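Now consider f[i][m]. By this point we already have f[i][m-1] = A+B*f[i][m], where A and B are the coefficients computed for column m-1. From the earlier result f[i][m] = (3+f[i][m-1]+f[i+1][m])/2, substitution gives f[i][m] = (3+A+B*f[i][m]+f[i+1][m])/2.

Collecting terms gives f[i][m] = (3+A+f[i+1][m])/(2-B), so the exact value of f[i][m] is finally known. Since every earlier column satisfies f[i][j] = A+B*f[i][j+1] with its own A and B, walking back from column m-1 to column 1 yields every f value in the row.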

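Row n is initialized to 0, and the process is then repeated for rows n-1 down to x, where x is the starting row. Counting the initialization, this touches n-x+1 rows, so the time complexity is O((n-x+1)*m).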

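There is one small pitfall: when m=1 the transitions above do not apply, because the robot can only stay or move down. In that case the transition is f[i][1]=f[i+1][1]+2.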
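
For completeness, the single-column transition can be derived in one line. This is a short worked derivation, writing f_{i,1} for f[i][1] and x for the starting row, with f_{n,1} = 0: the robot stays or moves down with probability 1/2 each.

```latex
f_{i,1} = \tfrac{1}{2}\bigl(f_{i,1} + f_{i+1,1}\bigr) + 1
\;\Longrightarrow\; f_{i,1} = f_{i+1,1} + 2
\;\Longrightarrow\; f_{x,1} = 2\,(n - x).
```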

The code is as follows:

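#include<cstdio>
using namespace std;
#define N 1010

int n,m,x,y;
// dp[i][j]: expected number of steps from cell (i,j) to the bottom row.
// Within a row, dp[i][j] = b[j] + a[j]*dp[i][j+1].
// Note: a[j] is the coefficient (B in the text), b[j] is the constant (A in the text).
double dp[N][N],a[N],b[N];

int main()
{
    scanf("%d%d%d%d",&n,&m,&x,&y);
    if(m==1)
    {
        // Single column: only "stay" and "down" are possible, so each row costs 2 steps.
        a[n]=0;
        for(int i=n-1;i>=x;--i) a[i]=a[i+1]+2;
        printf("%.10f\n",a[x]);
    }
    else
    {
        // The bottom row is the target, so its expectation is 0.
        for(int i=1;i<=m;++i) dp[n][i]=0;
        for(int i=n-1;i>=x;--i)
        {
            // Leftmost column: dp[i][1] = (3+dp[i+1][1])/2 + dp[i][2]/2.
            a[1]=0.5;b[1]=dp[i+1][1]/2+1.5;
            // Interior columns: substitute dp[i][j-1] = b[j-1] + a[j-1]*dp[i][j].
            for(int j=2;j<m;++j)
            {
                b[j]=b[j-1]/4.0+dp[i+1][j]/4.0+1.0;
                a[j]=0.25;
                a[j]/=(0.75-a[j-1]/4.0);
                b[j]/=(0.75-a[j-1]/4.0);
            }
            // Rightmost column is now fully determined.
            dp[i][m]=(b[m-1]+dp[i+1][m]+3.0)/(2-a[m-1]);
            // Back substitution from right to left.
            for(int j=m-1;j>=1;--j) dp[i][j]=b[j]+a[j]*dp[i][j+1];
        }
        printf("%.10f\n",dp[x][y]);
    }
    return 0;
}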