S3 upload error: Data read has a different length than the expected

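Error messages

While uploading files to S3, I ran into several kinds of errors.

Type 1: Data read has a different length than the expected: dataLength=15932; expectedLength=19241;

This error means that during the upload, the number of bytes actually read from the file (dataLength) did not match the length the SDK expected (expectedLength).

The full stack trace is as follows: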

com.amazonaws.SdkClientException: Data read has a different length than the expected: dataLength=15932; expectedLength=19241; includeSkipped=false; in.getClass()=class com.amazonaws.internal.ResettableInputStream; markedSupported=true; marked=0; resetSinceLastMarked=false; markCount=1; resetCount=0
        at com.amazonaws.util.LengthCheckInputStream.checkLength(LengthCheckInputStream.java:151)
        at com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:109)
        at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
        at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
        at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
        at com.amazonaws.auth.AwsChunkedEncodingInputStream.setUpNextChunk(AwsChunkedEncodingInputStream.java:306)
        at com.amazonaws.auth.AwsChunkedEncodingInputStream.read(AwsChunkedEncodingInputStream.java:172)
        at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:140)
        at com.amazonaws.http.RepeatableInputStreamRequestEntity.writeTo(RepeatableInputStreamRequestEntity.java:160)
        at org.apache.http.impl.DefaultBHttpClientConnection.sendRequestEntity(DefaultBHttpClientConnection.java:156)
        at org.apache.http.impl.conn.CPoolProxy.sendRequestEntity(CPoolProxy.java:160)
        at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:238)
        at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doSendRequest(SdkHttpRequestExecutor.java:63)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
        at org.apache.http.impl.client.InternalHttpClient.doExecute$original$mo6pBbRM(InternalHttpClient.java:185)
        at org.apache.http.impl.client.InternalHttpClient.doExecute$original$mo6pBbRM$accessor$0Mzlaxvy(InternalHttpClient.java)
        at org.apache.http.impl.client.InternalHttpClient$auxiliary$3bqvKzTe.call(Unknown Source)
        at org.apache.skywalking.apm.agent.core.plugin.interceptor.enhance.InstMethodsInter.intercept(InstMethodsInter.java:95)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
        at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1258)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1074)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:745)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:719)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:701)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:669)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:651)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:515)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4443)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4390)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1774)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1628)

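Type 2: Unable to calculate MD5 hash: /tmp/78c20e3adeb1202ade4ceb002cf4bd9e.png (No such file or directory)

This error means that the S3 client computes an MD5 checksum of the file when uploading it, and during that step it found that the specified file does not exist.

The stack trace for this one is much shorter: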

com.amazonaws.SdkClientException: Unable to calculate MD5 hash: /tmp/78c20e3adeb1202ade4ceb002cf4bd9e.png (No such file or directory)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1675)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1628)

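Suspected cause

My guess was that the first error occurs because the file changes while it is being uploaded to S3. In nearly every occurrence, expectedLength was larger than dataLength. Could the file be getting modified or rewritten during the upload? If so, the file would be incomplete while the rewrite is in progress, which would explain the length mismatch.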
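
To make this suspicion concrete, here is a minimal single-threaded Java sketch that imitates the race deterministically. It records the file size first (roughly as an uploader would when it fixes the expected content length), lets a second write truncate and partially rewrite the file, and then counts the bytes actually read. The class name LengthMismatchDemo is hypothetical, the byte counts are simply borrowed from the error message above, and nothing here calls the AWS SDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class LengthMismatchDemo {

    /** Returns {expectedLength, dataLength} for one simulated upload. */
    static long[] simulate() throws IOException {
        Path file = Files.createTempFile("upload-demo-", ".png");
        try {
            // "Thread 1" has finished writing the full file.
            Files.write(file, new byte[19241]);
            // The uploader records the length before it starts reading.
            long expectedLength = Files.size(file);

            // "Thread 2" reopens the same path: Files.write truncates the
            // file by default, and so far only part of the data is back.
            Files.write(file, new byte[15932]);

            // The uploader now reads the stream and counts what it gets.
            long dataLength = 0;
            try (InputStream in = Files.newInputStream(file)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    dataLength += n;
                }
            }
            return new long[] {expectedLength, dataLength};
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] r = simulate();
        // prints: dataLength=15932; expectedLength=19241
        System.out.println("dataLength=" + r[1] + "; expectedLength=" + r[0]);
    }
}
```

Because the second write truncates the file before refilling it, the reader can only ever see fewer bytes than were recorded, which matches the observation that expectedLength was the larger value.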

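Code investigation

So I went through the code and found that the upload logic works roughly like this:

  1. Concatenate a timestamp with the file name and compute the MD5 of the result. Use that value as the S3 key (call it md5key).
  2. Return md5key immediately and store it in the database. The upload itself then runs asynchronously on a thread pool:
    1. Fetch the attachment from the link the caller passed in and save it on the local server as md5key.jpg.
    2. Call the S3 service to upload md5key.jpg.
    3. Delete md5key.jpg from the server.

This is exactly where the problem lies!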

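  1. If the caller passes in several identical attachment links (link A, link A, link A) and their md5keys are all generated within the same millisecond, all three links end up with the same md5key.
  2. When the thread pool processes these three files, thread 1 writes the file to md5key.jpg and starts uploading. Meanwhile thread 2 also starts writing to md5key.jpg, so thread 1's upload sees a file length that no longer matches and fails.
  3. While thread 2 is uploading md5key.jpg, thread 3 starts writing to it as well. If thread 2 happens to finish its upload and delete md5key.jpg just as thread 3 has finished writing and is about to upload, thread 3 finds the file gone and reports the second error: file not found.

Checking the failed records confirmed that this was indeed the cause. Concurrency really does leave a lot to think about.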
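
The collision can be sketched in a few lines of Java. The original code is not shown in this post, so the method names below (md5Hex, collidingKey, uniqueKey) and the exact way the timestamp and file name are concatenated are assumptions made for illustration. The second method shows one possible way to avoid the collision, mixing a random UUID into the hashed input; it is not necessarily the fix that was actually applied.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

class Md5KeyDemo {

    /** Hex-encoded MD5 of a string, zero-padded to 32 characters. */
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Assumed shape of the original scheme: timestamp + file name. */
    static String collidingKey(long timestampMillis, String fileName) {
        return md5Hex(timestampMillis + fileName);
    }

    /** One possible alternative: add a random UUID per request. */
    static String uniqueKey(long timestampMillis, String fileName) {
        return md5Hex(timestampMillis + fileName + UUID.randomUUID());
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // Same millisecond, same file name: the keys are identical.
        System.out.println(collidingKey(now, "a.jpg").equals(collidingKey(now, "a.jpg")));
        // With a UUID mixed in, the keys differ.
        System.out.println(uniqueKey(now, "a.jpg").equals(uniqueKey(now, "a.jpg")));
    }
}
```

Whether identical links should get distinct keys, or share one key and be uploaded only once, is a design decision this post does not settle. Either way, the local file name should not be shared between concurrent tasks.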

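Conclusion

  1. The error Data read has a different length than the expected is very likely caused by another writer thread overwriting the file just as it is about to be uploaded. That is a useful line of investigation.
  2. The error No such file or directory means exactly what it says: the file cannot be found. So ask why the file disappeared, and check whether anything in the program deletes it.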
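
As a possible remedy for both errors, each asynchronous task can work on its own temporary file, so that no other task can overwrite or delete it mid-upload. The sketch below is an assumption about how that might look, not the code from the service described here: IsolatedUploadTask, process, and the Uploader interface are hypothetical names, and Uploader stands in for the real S3 call (with the AWS SDK for Java v1 that would be something like putObject(bucketName, key, file)).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class IsolatedUploadTask {

    /** Hypothetical stand-in for the real upload call. */
    interface Uploader {
        void upload(String key, Path file) throws IOException;
    }

    static void process(String key, InputStream attachment, Uploader uploader)
            throws IOException {
        // createTempFile picks a fresh random name, so tasks that share
        // the same key still get separate local files.
        Path tmp = Files.createTempFile(key + "-", ".tmp");
        try {
            Files.copy(attachment, tmp, StandardCopyOption.REPLACE_EXISTING);
            uploader.upload(key, tmp);
        } finally {
            // Each task deletes only the file it created itself.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[19241];
        ExecutorService pool = Executors.newFixedThreadPool(3);
        List<Future<?>> results = new ArrayList<>();
        // Three tasks with the same key, as in the failing scenario.
        for (int i = 0; i < 3; i++) {
            results.add(pool.submit(() -> {
                process("md5key", new ByteArrayInputStream(data),
                        (key, file) -> System.out.println(key + " size=" + Files.size(file)));
                return null;
            }));
        }
        for (Future<?> f : results) {
            f.get(); // rethrows any failure from the task
        }
        pool.shutdown();
    }
}
```

With separate files, every task sees the full 19241 bytes regardless of timing. Uploading the same key three times is still redundant work, so deduplicating identical links before submitting them to the pool may be worth considering as well.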