S3 upload error: Data read has a different length than the expected
阿新 • Published: 2022-03-14
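Error messages
While uploading files to S3, we ran into two kinds of errors.
Type 1: Data read has a different length than the expected: dataLength=15932; expectedLength=19241;
This error means that, during the upload, the number of bytes actually read from the file did not match the expected length.
The full stack trace is as follows: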
com.amazonaws.SdkClientException: Data read has a different length than the expected: dataLength=15932; expectedLength=19241; includeSkipped=false; in.getClass()=class com.amazonaws.internal.ResettableInputStream; markedSupported=true; marked=0; resetSinceLastMarked=false; markCount=1; resetCount=0
    at com.amazonaws.util.LengthCheckInputStream.checkLength(LengthCheckInputStream.java:151)
    at com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:109)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
    at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
    at com.amazonaws.auth.AwsChunkedEncodingInputStream.setUpNextChunk(AwsChunkedEncodingInputStream.java:306)
    at com.amazonaws.auth.AwsChunkedEncodingInputStream.read(AwsChunkedEncodingInputStream.java:172)
    at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:140)
    at com.amazonaws.http.RepeatableInputStreamRequestEntity.writeTo(RepeatableInputStreamRequestEntity.java:160)
    at org.apache.http.impl.DefaultBHttpClientConnection.sendRequestEntity(DefaultBHttpClientConnection.java:156)
    at org.apache.http.impl.conn.CPoolProxy.sendRequestEntity(CPoolProxy.java:160)
    at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:238)
    at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doSendRequest(SdkHttpRequestExecutor.java:63)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
    at org.apache.http.impl.client.InternalHttpClient.doExecute$original$mo6pBbRM(InternalHttpClient.java:185)
    at org.apache.http.impl.client.InternalHttpClient.doExecute$original$mo6pBbRM$accessor$0Mzlaxvy(InternalHttpClient.java)
    at org.apache.http.impl.client.InternalHttpClient$auxiliary$3bqvKzTe.call(Unknown Source)
    at org.apache.skywalking.apm.agent.core.plugin.interceptor.enhance.InstMethodsInter.intercept(InstMethodsInter.java:95)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
    at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1258)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1074)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:745)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:719)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:701)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:669)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:651)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:515)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4443)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4390)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1774)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1628)
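Type 2: Unable to calculate MD5 hash: /tmp/78c20e3adeb1202ade4ceb002cf4bd9e.png (No such file or directory)
This error means that the S3 client computes an MD5 checksum of the file when uploading it, and at that point the specified file could not be found.
This stack trace is much shorter: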
com.amazonaws.SdkClientException: Unable to calculate MD5 hash: /tmp/78c20e3adeb1202ade4ceb002cf4bd9e.png (No such file or directory)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1675)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1628)
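Suspected cause
My guess was that the first error occurs because the file changes while it is being uploaded to S3. In nearly every occurrence, expectedLength is larger than dataLength. Could the file be getting modified or rewritten during the upload? If so, the file would be incomplete while the rewrite is in progress, which would explain the length mismatch.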
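Code investigation
So I went through the code and found that the upload logic works roughly like this:
- Concatenate a timestamp with the file name and compute its MD5. This value is used as the S3 key (call it md5key).
- Return md5key immediately and save it to the database. The actual upload then runs asynchronously on a thread pool.
- Fetch the attachment link passed in by the business caller and save the file to the local server under the name md5key.jpg.
- Call the S3 service to upload md5key.jpg.
- Delete md5key.jpg from the server.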
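The key derivation in the first step can be sketched as follows. This is only a minimal illustration: the class name `Md5KeySketch`, the exact concatenation order, and the `/tmp/` directory with a `.jpg` suffix are assumptions, not the project's actual code. The point to notice is that the key, and therefore the local file name, depends only on the timestamp and the file name.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5KeySketch {
    /** Hypothetical key derivation: MD5 hex of "<timestampMillis><fileName>". */
    static String md5Key(long timestampMillis, String fileName) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest((timestampMillis + fileName).getBytes(StandardCharsets.UTF_8));
            // Left-pad to 32 hex characters so leading zero bytes are not dropped.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }

    /** Hypothetical local path for the temporary copy (directory and suffix are assumptions). */
    static String localPath(String md5Key) {
        return "/tmp/" + md5Key + ".jpg";
    }

    public static void main(String[] args) {
        String key = md5Key(System.currentTimeMillis(), "attachment.jpg");
        System.out.println(key + " -> " + localPath(key));
    }
}
```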
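The problem lies in this flow!
- If the caller passes in several identical attachment links (link A, link A, link A) and all of their md5keys are generated within the same millisecond, won't all three links end up with the same md5key?
- When the thread pool processes these three files, thread 1 writes the file to md5key.jpg and starts uploading. Meanwhile thread 2 starts writing to md5key.jpg as well, so thread 1's upload finds that the file length no longer matches and fails.
- After thread 2 has written md5key.jpg and while its upload is finishing, thread 3 also starts writing. Just as thread 3 finishes writing and is about to upload, thread 2 happens to complete its upload and deletes md5key.jpg. Thread 3 then finds the file gone and raises the second error: file not found.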
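The interleaving described in these bullets can be replayed deterministically in a single thread, with each step labeled by the thread that would perform it. This is a sketch under assumptions: `SharedPathReplay` is a hypothetical name, no S3 call is made, and the byte counts are simply borrowed from the error message above for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class SharedPathReplay {
    /** Replays the interleaving step by step and returns what thread 1 and thread 3 observe. */
    static String[] replay() {
        try {
            Path dir = Files.createTempDirectory("s3-replay");
            Path shared = dir.resolve("md5key.jpg"); // all three tasks resolve to this one path
            try {
                // Thread 1: writes the full file; the expected length is captured for the upload.
                Files.write(shared, new byte[19241]);
                long expectedLength = Files.size(shared);

                // Thread 2: starts rewriting the same path and has written only part of it so far.
                Files.write(shared, new byte[15932]);

                // Thread 1: streams the file and receives fewer bytes than expected.
                long dataLength = Files.readAllBytes(shared).length;
                String first = "dataLength=" + dataLength + "; expectedLength=" + expectedLength;

                // Thread 2: finishes writing. Thread 3: writes the same content again.
                Files.write(shared, new byte[19241]);
                Files.write(shared, new byte[19241]);

                // Thread 2: upload done, deletes the shared file.
                Files.delete(shared);

                // Thread 3: about to upload, but the file is gone.
                String second;
                try {
                    Files.readAllBytes(shared);
                    second = "file still present";
                } catch (NoSuchFileException e) {
                    second = "No such file: " + shared.getFileName();
                }
                return new String[] {first, second};
            } finally {
                Files.deleteIfExists(shared);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : replay()) {
            System.out.println(line);
        }
    }
}
```

In the real service these steps run on different pool threads, so whether either error appears depends on timing; the replay only fixes one unlucky ordering.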
Checking the failed records confirmed that this was indeed the cause. Concurrent scenarios really do leave a lot to think about.
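Conclusion
- The "Data read has a different length than the expected" error most likely means the file was overwritten by another writer thread just as it was about to be uploaded. You can investigate along these lines.
- The "No such file or directory" error means exactly what it says: the file cannot be found. So ask why the file disappeared, and check whether your program has logic somewhere that deletes it.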
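The post does not say how the code was eventually fixed. One possible remedy, sketched here purely as an assumption, is to give every task its own temporary file so that tasks sharing the same md5key never share a local path. `UniqueTempFileSketch` and `fakeUpload` are hypothetical names; `fakeUpload` stands in for the real S3 call and only reports the file size it would send.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class UniqueTempFileSketch {
    /** Hypothetical stand-in for the real upload; it only reports how many bytes it would send. */
    static long fakeUpload(Path file) throws IOException {
        return Files.size(file);
    }

    /** One task: write the attachment to a private temp file, "upload" it, then delete it. */
    static long process(String md5Key, byte[] attachment) {
        try {
            // createTempFile generates a unique name, so tasks with the same md5Key get different paths.
            Path tmp = Files.createTempFile(md5Key + "-", ".jpg");
            try {
                Files.write(tmp, attachment);
                return fakeUpload(tmp);
            } finally {
                Files.deleteIfExists(tmp); // removes only this task's own file
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the same key through several concurrent tasks and returns the byte count each one saw. */
    static List<Long> runConcurrently(int tasks, int size) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            byte[] attachment = new byte[size];
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<Long> task = () -> process("md5key", attachment);
                futures.add(pool.submit(task));
            }
            List<Long> seen = new ArrayList<>();
            for (Future<Long> f : futures) {
                seen.add(f.get());
            }
            return seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(3, 19241));
    }
}
```

With this layout the three tasks would still upload to the same S3 key, but since the links are identical the content should be too, so the last write winning would presumably be harmless. Deduplicating tasks by key before submitting them would be another option.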