Friday, September 26, 2008

My Artifactory Versus Nexus Experience

Like many software development teams, my team needed a decent internal Maven repository. So, one of my trusted colleagues, who just happens to be the smartest Java guy I know, set up Artifactory 1.2.5. For those of you who haven't taken the time to set up an internal Maven repository, just know that you can get up and running with very little effort.

Some of the benefits of an internal Maven repository
  1. Proxy and cache JARs from 3rd party repos. This will usually decrease Maven build time for first-time builders. Of course, subsequent builds usually pull JARs locally from the developer's .m2 cache.
  2. Turn off proxies once all needed 3rd party JARs are available in your repo. This will reduce the risk of introducing broken dependencies, corrupt JARs or unwanted dependencies.
  3. Deploy internally developed artifacts to a repository. Those artifacts can be downloaded by other internal development teams but remain unavailable to the general public. Obviously, this is desirable for software companies and corporate internal IT teams.
There are other benefits, but these are the big three that motivated our team to set up a Maven repository. We chose Artifactory 1.2.5 with default settings.
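
As a rough illustration of the third benefit, deploying internal artifacts usually comes down to a distributionManagement section in the project's pom.xml. This is a minimal sketch, not our actual configuration: the ids, host name, and repository paths below are made-up placeholders.

```xml
<!-- pom.xml fragment: tells "mvn deploy" where to upload artifacts.
     Host name, ids, and paths are hypothetical placeholders. -->
<distributionManagement>
  <repository>
    <id>internal-releases</id>
    <name>Internal release repository</name>
    <url>http://repo.example.com/releases</url>
  </repository>
  <snapshotRepository>
    <id>internal-snapshots</id>
    <name>Internal snapshot repository</name>
    <url>http://repo.example.com/snapshots</url>
  </snapshotRepository>
</distributionManagement>
```

If the repository requires credentials, Maven matches each id above against a server entry with the same id in settings.xml.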

Some key aspects to Artifactory
  1. Jetty-based unzip-and-go solution, but a WAR is available that can be deployed to Tomcat if desired.
  2. Artifactory uses a JCR-based repository called Jackrabbit. I really don't like this design choice.
  3. Artifactory is user friendly and relatively straightforward to set up, assuming you are comfortable hand-editing a single XML file for configuration.
  4. If you want to run Artifactory as a Windows service, you must create your own JavaService wrapper.
  5. There is adequate security and user/role management.
Artifactory would still be in place right now if it weren't for one serious problem. Once our WAR became rather large (45MB), we started experiencing 500 errors that appeared to be caused by JCR node lock issues, and these made it impossible to deploy new internal artifacts. This crippled our development team in the middle of a very important sprint. In all fairness to the Artifactory guys, I didn't reach out to them for help. I googled for 1.2.5 issues, and a common fix suggestion was to upgrade to 1.3.0-Beta. After googling for 1.3.0 upgrade suggestions, it became apparent to me that I would have to re-install and re-configure Artifactory. Since my manager, the second smartest Java guy I know, had sent me a link to Nexus a couple of weeks earlier, I felt compelled to give it a go.

Here's my take on Nexus
  1. Nexus is a Jetty unzip and go only offering. There is no easy way to run it in Tomcat.
  2. There is no need to hand-edit an XML config file. All of the setup can be done through the GUI. The GUI is ExtJS based and prettier than Artifactory's, although I don't give a hoot about that.
  3. Nexus provides a JavaService wrapper for Windows.
  4. Nexus has a really nice feature that allows you to merge several repositories and publish them via a single URL. There is no need to have multiple URL entries in your pom.xml file or conduct tedious repository administration.
  5. Nexus stores artifacts on the file system. It is easy to browse/fix these files outside of the GUI if something goes wrong.
  6. Nexus time stamps all deployed artifacts. I'm not a huge fan of this because it makes the disk consumption higher than Artifactory's, but you can set up a scheduled task or cron job to clean up old JARs.
  7. Nexus URLs for the latest version of an artifact are redirect URLs. Artifactory URLs don't use redirects and are easier to embed in scripts.
  8. There is adequate security and user/role management.
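
To make point 4 concrete, one common way to funnel every build through a single grouped URL is a mirror entry in settings.xml. This is only a sketch: the host name is a placeholder, and the group name and URL path are assumptions that will depend on how your Nexus instance is installed.

```xml
<!-- ~/.m2/settings.xml fragment: route all repository requests
     through one grouped URL. Host, id, and path are hypothetical. -->
<settings>
  <mirrors>
    <mirror>
      <id>internal-group</id>
      <!-- "*" means this mirror stands in for every repository -->
      <mirrorOf>*</mirrorOf>
      <url>http://repo.example.com:8081/nexus/content/groups/public</url>
    </mirror>
  </mirrors>
</settings>
```

With something like this in place, individual pom.xml files don't need their own list of repository URLs.
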
Verdict

The Artifactory deploy issue confirmed my hatred of Jackrabbit. There are times when a JCR repository makes sense, but I don't believe a Maven repository warrants a JCR backend. I think a vanilla filesystem-based repository makes more sense and is easier to manage and fix. I think both projects are very good choices, but I believe Nexus 1.0.0 is better than Artifactory 1.2.5. Since both projects are in active development, it is important for you to assess the latest stable offering of both projects and pick the one you like the best.

12 comments:

Brian Fox said...

Hi James, thanks for reviewing Nexus.

One comment: it's actually Maven that is making the timestamped snapshots, not Nexus. Nexus stores the files "PUT" by Maven. If you don't want to use timestamped snapshots, there is an option in the distributionManagement section to tell Maven not to use timestamps: http://maven.apache.org/pom.html#Repository
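
For reference, the option described above looks roughly like this in a pom.xml. The uniqueVersion element is the one covered in the POM reference linked above; the id and URL here are placeholders, not values from this thread.

```xml
<!-- pom.xml fragment; id and URL are hypothetical placeholders -->
<distributionManagement>
  <snapshotRepository>
    <!-- false: redeploy each snapshot under the same file name
         instead of adding a timestamp and build number -->
    <uniqueVersion>false</uniqueVersion>
    <id>internal-snapshots</id>
    <url>http://repo.example.com/snapshots</url>
  </snapshotRepository>
</distributionManagement>
```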

Yoav Landman said...

Hello James,
A couple of comments on Artifactory:

1. Like you mentioned, the locking issues you had are resolved in 1.3.0-Beta. Those issues, however, were pure implementation and not related to JCR. They can also be worked around in 1.2.5 by using the webdav http wagon. Jackrabbit is not always perfect but it still provides the best platform for concurrent, transactional metadata management.
2. Upgrading to 1.3.0-Beta doesn't require any re-install or re-config. Even though it is still in beta, we provide a direct upgrade path to 1.3 from any legacy version.
3. Artifactory does have repository grouping (in 1.2.5). This feature is called Virtual Repositories. You can also nest and reuse virtual repository definitions.
4. Maven is the one responsible for time-stamping snapshots, true. Using Artifactory, however, you can centrally control the format of snapshots, regardless (or respecting) what users chose to put in their poms. That's why you got clean snapshots (see also: http://tinyurl.com/3f5brc).

We are working hard to provide the next version of Artifactory with new features, some already in beta. So like you offer - keep watching ;)

Deli said...

> Nexus is a Jetty unzip and go only offering. There is no easy way to run it in Tomcat.

That is the single biggest reason to not use Nexus. Since we couldn't install it on Tomcat with our JIRA, Confluence, Hudson, etc. we went with Artifactory. I don't care what Nexus' reason is for that decision, I'm not using it because of that.

James Williams said...

Brian,

Thank you for the clarification on timestamped snapshots. I think I misinterpreted the repository behavior, because Artifactory 1.2.5 either hides the time stamps when browsing through the GUI or the uploader does something special to convert the timestamped JARs. I learn something new about Maven every day. :)

James Williams said...

Yoav,
Thank you for the clarification as to the root cause of the JCR node lock issue. My disdain for Jackrabbit has to do with past experience with a portal project that used Jackrabbit. I should have researched the 1.3.0 upgrade a bit longer before migrating to Nexus. I concede that Artifactory does have repository grouping via virtual repositories, but I do think that the GUI and inclusion/exclusion capabilities in Nexus are a bit better. We were using the virtual repositories feature of Artifactory and it was quite adequate. But I was impressed with the way that Nexus implemented this feature.

Khai said...

A note about Artifactory. I installed Artifactory (ver 1.2.5) and have been using it for about 6 months. I deployed it in Tomcat 5.5 on Windows 2003 using the Artifactory DB. It worked great for a few months until our repository started getting really large. Our internal repository is about 100 GB now, and once it started getting up to this size Artifactory started getting data corruption errors. Since then I've experienced data corruption in the DB a few times already. I do have backups set up to save to zip. However, the zip files are also corrupted. I'm not sure if the data caused the corruption or if zips just do not scale to a certain size. In any case I have concluded that Artifactory does not scale once your repository gets large. Also, it's very difficult to restore an Artifactory DB using a local repository. I am now evaluating Nexus to see if it scales better than Artifactory.

Yoav Landman said...

@Khai:
Your DB corruption might have been caused by a bug with the Derby version that was shipped with 1.2.5. This was resolved with Artifactory 1.3.0 where we also use in-place incremental backups, which are faster, very reliable and consume less system resources.
You can read more in my comments to the relevant JIRA issue here.

Steven said...

I think Yoav has glossed over an important point that Brian noted.

We too have been stung by Artifactory 1.2.5 rc0 getting corrupt artifacts (Internal server error 500) and apparent uncontrolled growth of the Derby database.

Once the corruption occurs it appears to be impossible to "backup" the system. Without backup, hand re-publishing of artifacts is required, at least for any repository that has at least one broken artifact. One bad artifact spoils the whole bucket.

I'm quite nervous about upgrading to 1.3.0 at this point rather than going with a different solution.

One of the significant downsides of using a system like Artifactory is that when it blows up - even for one dependent artifact - every developer is impacted. It's a single point of failure.

I also made the mistake of using Artifactory as a "software repository" for our delivered solutions, e.g. war files. Unfortunately, there does not appear to be a way to expose the complete artifact name. For example, upload myartifact.1.2.3-57.0911090544.jar and it is stored, known, and downloadable only as "myartifact.1.2.3.jar". The build number and timestamps are no longer visible. I understand that this is a misuse of the tool, but it seemed like a good idea at the time!

Can Nexus be my artifact repository and my "released software repository" and can it include the complete uploaded name rather than just the artifactId and version?

Brian Fox said...

Hi Steven,
This is exactly why we insist on using a flat m2 layout on disk for Nexus. It means that should some corruption occur, even at the disk level, it would only affect a few files and not the entire system. Plus you can use standard tools for incremental backups and replication.

In an absolute emergency, the storage for Hosted and Proxy repos can be served straight up by httpd and Maven would be perfectly happy.

Nexus stores the files exactly as they are "put" by Maven and they are retrievable that way. There's no monkey business with removing the timestamps for snapshots etc. We do provide some simple REST APIs that allow you to retrieve dynamically changing snapshots from a single constant URL, but the files are always unmodified.

Yoav Landman said...

Hi Steven,

Artifactory does allow you to expose the "full" artifact name. The default is to use non-unique snapshots (to save disk space and as good practice), but the option is there for you to decide whether the repository's snapshots policy should respect the maven client settings or enforce unique or non-unique snapshot names.

There are good reasons why Artifactory is using a transactional storage medium, i.e. a database, which are probably similar to the reasons why apt-get, yum, svn, mercurial etc. moved away from a vanilla FS-based storage, and many of the features and artifact-coupled metadata that we added are nearly impossible to achieve reliably with a FS. Restore does require using the provided tools, but you gain a different level of data integrity (true, not with an rc0 and a bad Derby version). In both cases you need to maintain backups... The points you brought up were certainly not overlooked, and since 1.2.5rc0 (more than a year ago) Artifactory has gone a long way toward being ultra reliable for its users by testing with long-stable Derby versions, providing highly reliable backups and now running on MySQL to leverage its excellent backup and restore tools.

Oxana said...

Did you see the new Artifactory 2.0? Looks like they are on top again.