HDFS-17528. FsImageValidation: set txid when saving a new image #6828

Open
wants to merge 8 commits into base: trunk

szetszwo
Contributor

@szetszwo szetszwo commented May 14, 2024

Description of PR

HDFS-17528

  • When the fsimage is specified as a file and the FsImageValidation tool saves a new image (to remove inaccessible inodes), the txid is not set. As a result, the saved image has 0 as its txid.
  • When the fsimage is specified as a directory, the txid is set. However, the tool then hits a NullPointerException because the NameNode metrics are uninitialized (even though FsImageValidation does not use the metrics).
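
The second failure mode can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical: `Metrics`, `recordUnguarded`, and `recordGuarded` are stand-ins invented for illustration and are not Hadoop classes or methods, and the null check is only one common way to avoid this kind of failure. The actual patch may take a different approach, such as initializing the metrics.

```java
import java.util.concurrent.atomic.AtomicLong;

public class MetricsGuardSketch {
    /** Hypothetical stand-in for a metrics object; not Hadoop's NameNode metrics class. */
    static final class Metrics {
        private final AtomicLong saveMillis = new AtomicLong();

        void addSaveTime(long millis) {
            saveMillis.addAndGet(millis);
        }

        long totalSaveMillis() {
            return saveMillis.get();
        }
    }

    /** Stays null when the code runs as an offline tool instead of inside a server. */
    static Metrics metrics = null;

    /** Unguarded update: throws NullPointerException if metrics was never initialized. */
    static void recordUnguarded(long millis) {
        metrics.addSaveTime(millis);
    }

    /** Guarded update: silently skips the update when no metrics object exists. */
    static void recordGuarded(long millis) {
        Metrics m = metrics;
        if (m != null) {
            m.addSaveTime(millis);
        }
    }

    public static void main(String[] args) {
        try {
            recordUnguarded(3000);
        } catch (NullPointerException e) {
            System.out.println("unguarded call failed with NullPointerException");
        }

        recordGuarded(3000); // metrics is still null, so nothing is recorded and nothing fails

        metrics = new Metrics();
        recordGuarded(3000);
        System.out.println("total save time: " + metrics.totalSaveMillis() + " ms");
    }
}
```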

How was this patch tested?

Tested manually

  • before: the output file is fsimage.ckpt_0000000000000000000 (i.e. txid is 0)

2024-05-14 13:37:27,531 [main] INFO namenode.FSImageFormatProtobuf (FSImageFormatProtobuf.java:save(732)) - Saving image file .../fsimage/current/newFsImage5968764763996132609/current/fsimage.ckpt_0000000000000000000 using no compression
2024-05-14 13:37:30,522 [main] INFO namenode.FSImageFormatProtobuf (FSImageFormatProtobuf.java:save(736)) - Image file .../fsimage/current/newFsImage5968764763996132609/current/fsimage.ckpt_0000000000000000000 of size 200392059 bytes saved in 2 seconds .

  • after: the output file is fsimage.ckpt_0000000023945925442, which carries the correct txid

2024-05-14 13:38:32,414 [main] INFO namenode.FSImage (FSImage.java:save(1223)) - save fsimage with txid=23945925442 to .../fsimage/current/newFsImage4409944859316006440
2024-05-14 13:38:32,436 [main] INFO namenode.FSImageFormatProtobuf (FSImageFormatProtobuf.java:save(732)) - Saving image file .../fsimage/current/newFsImage4409944859316006440/current/fsimage.ckpt_0000000023945925442 using no compression
2024-05-14 13:38:35,437 [main] INFO namenode.FSImageFormatProtobuf (FSImageFormatProtobuf.java:save(736)) - Image file .../fsimage/current/newFsImage4409944859316006440/current/fsimage.ckpt_0000000023945925442 of size 200392062 bytes saved in 3 seconds .
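
The file names in the logs above end in a 19-digit, zero-padded txid, which is why an unset txid shows up as a suffix of all zeros. The sketch below reproduces that naming pattern in plain Java. The `%019d` format is inferred from the log output, and `checkpointName` and `parseTxid` are hypothetical helpers written for this illustration, not Hadoop APIs.

```java
public class CheckpointNameSketch {
    /** File name prefix as it appears in the log output. */
    static final String PREFIX = "fsimage.ckpt";

    /** Builds a checkpoint file name with the txid zero-padded to 19 digits. */
    static String checkpointName(long txid) {
        return String.format("%s_%019d", PREFIX, txid);
    }

    /** Recovers the txid from a checkpoint file name produced by checkpointName. */
    static long parseTxid(String fileName) {
        String expectedStart = PREFIX + "_";
        if (!fileName.startsWith(expectedStart)) {
            throw new IllegalArgumentException("not a checkpoint file name: " + fileName);
        }
        return Long.parseLong(fileName.substring(expectedStart.length()));
    }

    public static void main(String[] args) {
        // Before the patch: the txid was never set, so it defaulted to 0.
        System.out.println(checkpointName(0L));
        // After the patch: the txid read from the input image is carried over.
        System.out.println(checkpointName(23945925442L));
        System.out.println(parseTxid("fsimage.ckpt_0000000023945925442"));
    }
}
```

A check like `parseTxid(name) == 0` on the output file is a quick way to spot an image that was saved without its txid.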

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • [NA] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • [NA] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • [NA] If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@szetszwo
Contributor Author

@vinayakumarb, thanks a lot for reviewing this!

@szetszwo
Contributor Author

The Jenkins builds keep getting stuck for about a day and then fail. Is this a known problem?

@szetszwo
Contributor Author

szetszwo commented Jun 6, 2024

The last few lines of the Jenkins build before failure:

[2024-06-03T20:14:42.551Z] cd /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-6828/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs
[2024-06-03T20:14:42.551Z] /usr/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-6828/yetus-m2/hadoop-trunk-patch-0 -Dsurefire.rerunFailingTestsCount=2 -Pparallel-tests -P!shelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.zstd -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-6828/ubuntu-focal/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt 2>&1
[2024-06-05T05:43:40.181Z] wrapper script does not seem to be touching the log file in /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-6828@tmp/durable-db261340
[2024-06-05T05:43:40.181Z] (JENKINS-48300: if on an extremely laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400)
script returned exit code -1

@slfan1989
Contributor

slfan1989 commented Jun 11, 2024

@szetszwo, please resubmit once more; I think the issue is fixed. It appears to be related to the surefire upgrade (#6664), which we have reverted.

@szetszwo
Contributor Author

@slfan1989, thanks for the info! Let me resubmit.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 31s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 6s trunk passed
+1 💚 compile 1m 20s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 1m 17s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 1m 12s trunk passed
+1 💚 mvnsite 1m 25s trunk passed
+1 💚 javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 1m 47s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 3m 16s trunk passed
+1 💚 shadedclient 35m 51s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 11s the patch passed
+1 💚 compile 1m 13s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 1m 13s the patch passed
+1 💚 compile 1m 7s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 javac 1m 7s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 59s the patch passed
+1 💚 mvnsite 1m 15s the patch passed
+1 💚 javadoc 0m 53s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 1m 35s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 3m 14s the patch passed
+1 💚 shadedclient 35m 37s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 226m 34s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 45s The patch does not generate ASF License warnings.
366m 31s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6828/6/artifact/out/Dockerfile
GITHUB PR #6828
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 2eee5c0a5c7a 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 256d41d
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6828/6/testReport/
Max. process+thread count 4432 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6828/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 31s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 44m 10s trunk passed
+1 💚 compile 1m 22s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 1m 17s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 1m 10s trunk passed
+1 💚 mvnsite 1m 22s trunk passed
+1 💚 javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 1m 46s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 3m 20s trunk passed
+1 💚 shadedclient 35m 59s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 13s the patch passed
+1 💚 compile 1m 12s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 1m 12s the patch passed
+1 💚 compile 1m 8s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 javac 1m 8s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 59s the patch passed
+1 💚 mvnsite 1m 16s the patch passed
+1 💚 javadoc 0m 53s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 1m 35s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 3m 15s the patch passed
+1 💚 shadedclient 35m 45s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 227m 54s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 47s The patch does not generate ASF License warnings.
366m 20s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6828/7/artifact/out/Dockerfile
GITHUB PR #6828
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 32f9b1d96c7c 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / d61a831
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6828/7/testReport/
Max. process+thread count 3245 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6828/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@szetszwo
Contributor Author

Now, the unit tests are failing.

@ayushtkn
Member

You can ignore that test failure. I have left a comment on the original ticket that caused it; the test is failing in our daily build as well.

The earlier build crashes were due to the surefire upgrade, which has been reverted. There is a thread on hdfs-dev@ about it.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 30s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 44m 27s trunk passed
+1 💚 compile 1m 22s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 1m 18s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 1m 12s trunk passed
+1 💚 mvnsite 1m 23s trunk passed
+1 💚 javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 1m 44s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 3m 16s trunk passed
+1 💚 shadedclient 35m 39s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 13s the patch passed
+1 💚 compile 1m 14s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 1m 14s the patch passed
+1 💚 compile 1m 7s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 javac 1m 7s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 58s the patch passed
+1 💚 mvnsite 1m 12s the patch passed
+1 💚 javadoc 0m 54s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 1m 37s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 3m 20s the patch passed
+1 💚 shadedclient 35m 52s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 228m 8s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 46s The patch does not generate ASF License warnings.
366m 19s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6828/8/artifact/out/Dockerfile
GITHUB PR #6828
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux ae1a75664bb4 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 972adb7
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6828/8/testReport/
Max. process+thread count 3412 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6828/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

5 participants