First of all, it is extremely important to understand what sort of Java developer you are. Based on that, I can perhaps suggest ways to enhance your profile.
Apache Hadoop is an ecosystem, a framework consisting of several components. There are many supplementary components, and you might have heard of some of them, such as Apache Pig, Sqoop, and YARN (Spark is also one of them, but we will get to that later). All of these components run in conjunction across several machines.
Think of Hadoop as a builder’s suitcase. It holds many things:
a) Pens – to capture information (data ingestion)
b) Notebooks – to store the created designs (data storage)
c) Tools, such as a screwdriver or hammer – to carry out a process (data transformation)
and so on.
Hadoop is similar: it encompasses a whole suite of components which, used effectively, can carry out transformations on terabytes and even petabytes of data. The processing model Hadoop uses is called MapReduce, and as a Java programmer, I feel you should first understand how MR works and how to develop MapReduce applications in Java.
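To make the map/shuffle/reduce idea concrete, here is a minimal word count, the canonical MapReduce example, sketched in plain Java. Note this uses only the JDK, not the actual Hadoop API (which gives you `Mapper` and `Reducer` classes playing the same roles across many machines), so you can see the phases clearly before touching a cluster:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCountSketch {

    public static Map<String, Long> wordCount(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                // "Map" phase: emit each word as a key (conceptually a (word, 1) pair).
                .filter(w -> !w.isEmpty())
                // "Shuffle" + "reduce" phase: group identical keys together
                // and sum up the occurrences for each word.
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts =
                wordCount("Hadoop stores data; Spark and Hadoop process data");
        // Each word is mapped to how many times it appears,
        // e.g. hadoop -> 2, data -> 2, spark -> 1.
        System.out.println(counts);
    }
}
```

In real Hadoop, the map and reduce steps run as separate tasks on different nodes and the framework handles the shuffle between them, but the logical flow is exactly this.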
Spark is another such tool that is much faster than MapReduce, largely because of the way it handles data (it keeps intermediate results in memory rather than writing them to disk between steps). It has different operation modes, so you can run it on your local system or on a Hadoop cluster. When running on a Hadoop cluster, applications leverage the resources pooled by the machines of your cluster.
So companies usually deploy Hadoop clusters of thousands of nodes and run Spark applications on them to transform data in real time and present the results too!
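The nice part is that the same application jar can be submitted in either mode just by changing the master setting. A rough sketch (the class and jar names here are placeholders, not from any real project):

```shell
# Run locally, using all cores of your own machine (good while learning):
spark-submit --master "local[*]" --class com.example.MyJob my-job.jar

# Run on a Hadoop cluster via YARN, using the cluster's pooled resources:
spark-submit --master yarn --deploy-mode cluster --class com.example.MyJob my-job.jar
```

This is why starting on your laptop is fine: nothing about your code has to change when you move to a cluster.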
So Spark development can also be done in Java, but to use it effectively you must know what the purpose of transforming the data is, i.e. an analyst's purpose. You can learn what to do with data, but what is to be derived from the data should be learnt first. So my suggestion would be to start programming in MapReduce, and then, to understand how Hadoop works, undergo the HDPCA certification.
Once you have knowledge of how Hadoop functions, focus on deriving insights from data, and then you should learn Spark.