Tag Archives: java

Collecting java.util.logging to log4j2

Everybody wants to write a log. And in Java everybody wants to write their own logging framework, or at least use one of the many different ones. Then someone comes up with a logging framework framework such as SLF4J.

OK, but what was I about to say? As so many times before, I had a piece of Java software writing a log file using Log4J2. I was using some libs/someone else’s code that uses java.util.logging to write its log. I wanted to capture those logs and include them in my Log4J2 log file for debugging, error resolution or whatever.

This case was when trying to log errors from the InfluxDB Java driver. The driver uses java.util.logging for minimal external dependencies or something. I used Log4J2 in my app.

So the usual question: how do you merge java.util.logging output from code that you do not control with your own code using Log4J2, to produce a single unified log file?

Most Googling would tell me all about SLF4J etc. I did not want yet-another framework on top of existing frameworks, and yet some more (transitive) dependencies and all sorts of weird stuff. Because I am old and naughty and don’t like too many abstractions just because.

So here is the code to do this with zero extra dependencies.

First a log Handler object for java.util.logging to write to Log4J2:

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

/**
* @author Daddy Bigbelly.
*/
public class JekkuHandler extends Handler {
  //notice that this is the Log4J2 logger here, inside a java.util.logging Handler object
  private static final Logger log = LogManager.getLogger();

  @Override
  public void publish(LogRecord record) {
    //map the java.util.logging level to the closest Log4J2 level.
    //comparing with >= also catches custom levels above SEVERE or WARNING
    Level level = record.getLevel();
    if (level.intValue() >= Level.SEVERE.intValue()) {
      log.error(record.getMessage(), record.getThrown());
    } else if (level.intValue() >= Level.WARNING.intValue()) {
      log.warn(record.getMessage(), record.getThrown());
    } else if (level.intValue() >= Level.INFO.intValue()) {
      log.info(record.getMessage(), record.getThrown());
    } else {
      log.debug(record.getMessage(), record.getThrown());
    }
  }

  @Override
  public void flush() {}

  @Override
  public void close() throws SecurityException {}
}
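One thing to watch for in a handler like this: record.getMessage() returns the raw message template, so if the library logs with {0}-style parameters they are passed on unfilled. A minimal sketch of one way around that, using the formatMessage method that java.util.logging formatters inherit. The class name MessageFormatSketch, the rendered helper and the sample message are made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.SimpleFormatter;

public class MessageFormatSketch {
  //SimpleFormatter inherits formatMessage(LogRecord) from java.util.logging.Formatter
  private static final SimpleFormatter FORMATTER = new SimpleFormatter();

  /** Returns the record's message with any {0}-style parameters filled in. */
  public static String rendered(LogRecord record) {
    return FORMATTER.formatMessage(record);
  }

  public static void main(String[] args) {
    //hypothetical record, as a library might create it
    LogRecord record = new LogRecord(Level.INFO, "wrote {0} points to {1}");
    record.setParameters(new Object[] {42, "aTimeSeries"});
    System.out.println(record.getMessage()); //the raw template
    System.out.println(rendered(record));    //the template with parameters filled in
  }
}
```

In the handler above, this would mean passing the rendered string to the Log4J2 logger instead of record.getMessage().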

Next setting it up and using it, with the InfluxDB Java driver as an example:

import org.influxdb.InfluxDB;
import org.influxdb.InfluxDBFactory;
import org.influxdb.dto.Point;
import org.influxdb.impl.BatchProcessor;

import java.util.concurrent.TimeUnit;
import java.util.logging.Handler;
import java.util.logging.Logger;

/**
* @author Daddy Bigbelly.
*/

public class LogCaptureExample {
  public static void main(String[] args) throws Exception {
    //oh no the root password is there
    InfluxDB db = InfluxDBFactory.connect("http://myinfluxdbhost:8086", "root", "root");
    String dbName = "aTimeSeries";
    db.createDatabase(dbName);
    db.enableBatch(2000, 1, TimeUnit.SECONDS);

    //if you look at the influxdb driver code for BatchProcessor,
    //where we want to capture the log from, you see it using the class name to set up its logger.
    //so we use the same class name here to get that logger and attach our own handler to it
    String loggerName = BatchProcessor.class.getName();
    System.out.println(loggerName);
    Logger logger = Logger.getLogger(loggerName);
    Handler handler = new JekkuHandler();
    logger.addHandler(handler);

    //this runs forever, but the batch mode can throw an error if the network drops.
    //so disconnect network to test this in middle of execution
    while (true) {
      Point point1 = Point.measurement("cpu")
        .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
        .addField("idle", 90L)
        .addField("user", 9L)
        .addField("system", 1L)
        .build();
      db.write(dbName, "autogen", point1);
    }
  }
}

You could probably quite easily configure a global java.util.logging handler (on the root logger) that would capture all logging written with java.util.logging this way. I did not need it, so it’s not here.

In a similar way, you should be able to capture java.util.logging to any other log framework just by changing where the custom Handler writes the logs to.
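The two ideas above (a global handler, and a different target framework) could be sketched roughly as follows. This is not something I have run against the InfluxDB driver. It assumes the root logger (the one with the empty name) is the right place to attach, and it uses a plain Consumer as a stand-in for whatever framework the lines should end up in. The names RootCaptureSketch, ForwardingHandler and captureAll are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class RootCaptureSketch {

  /** Forwards every record to a sink, which stands in for Log4J2 or any other framework. */
  public static class ForwardingHandler extends Handler {
    private final Consumer<String> sink;

    public ForwardingHandler(Consumer<String> sink) {
      this.sink = sink;
    }

    @Override
    public void publish(LogRecord record) {
      if (!isLoggable(record)) return;
      //getMessage() is the raw message, parameters are not filled in here
      sink.accept(record.getLevel() + " " + record.getLoggerName() + ": " + record.getMessage());
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }

  /** Replaces the handlers of the root logger so all java.util.logging output goes to the sink. */
  public static void captureAll(Consumer<String> sink) {
    Logger root = Logger.getLogger("");
    for (Handler existing : root.getHandlers()) {
      root.removeHandler(existing); //drops the default console handler to avoid duplicate output
    }
    root.addHandler(new ForwardingHandler(sink));
  }

  public static void main(String[] args) {
    List<String> captured = new ArrayList<>();
    captureAll(captured::add);
    //"some.library.Worker" is a hypothetical logger name, not one from a real library
    Logger.getLogger("some.library.Worker").warning("write failed");
    System.out.println(captured);
  }
}
```

This only sees records from loggers that pass their records on to parent handlers, which is the default. The root logger level (INFO by default) still filters out anything finer.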

Well there you go. Was that as exciting for you as it was for me?

Performance testing with InfluxDB + Grafana + Telegraf, Part 2

Previously I ran my performance test in a local test environment. However, the actual server should run in the Amazon EC2 cloud on one of the free tier micro instances (yes, I am cheap but it is more of a hobby project bwaabwaa.. :) ). So I installed the server on my EC2 micro instance and ran the same tests against it. The test client(s) on my laptop, the server in EC2..

So what did I expect to get? Maybe a bit slower but mainly the same performance. Well, let’s see..

Here is the start of the session:

[image: init_2freq]

A couple of things to note. I added a chart showing how many different IP addresses the sessions are coming from (middle line here). This is just one, due to hitting them all from my laptop. In the production version this should be more interesting. Also, the rate at which the tester is able to create new sessions and keep them running is much bumpier and slower than when running the same tests locally. Finally, there is the fact that the tester is observing a much bigger update frequency variation.

To examine if this delay variation is due to server load or network connections, I tried turning off the tester frequency line and show only the server line:

[image: init_1freq]

And sure enough, on the server end there is very little variation. So the variation is due to network load/speed. Not a real problem here, since most likely there would not be hundreds of clients connecting over the same IP line/NAT router. However, the rate at which my simple tester and single IP approach scales up is much slower and gets slower over time:

[image: 1300sessions]

So I ran it for about 1300 clients as shown above. For more extensive tests I should really provision some short-term EC2 micro instances and use those to load test my server instance. But let’s look at the resource use for now:

[image: 1300sessions_mem]

So the EC2 micro instance has a maximum of 1GB memory, out of which the operating system takes its own chunk. I started the server JVM in this case without remembering to set the JVM heap limit specifically, so in this case it is set to about 256MB. However, from my previous tests this was enough for pretty much the 4000 clients I tested with so I will just go with that for now.

And how about the system resources? Let’s see:

[image: byte_count]

CPU load never goes above 30% so that is fine. The system has memory left, so I can allocate more for the JVM if I need to, so fine. Actually it has more than shown above, as I learned looking at this that there is a bunch of cached and buffered memory that is also available although not listed as free. At least for “top”.. :)

But more interestingly, the “no title” chart at the bottom is now again the network usage chart that on my OSX local instance did not work. In the EC2 environment Telegraf seems to be able to capture this data. This is very useful as in EC2 usage they also charge you for your network traffic. So I need to be able to monitor it and to figure out how much network traffic I will be generating. This chart shows I have initially used much more inbound traffic (uploading my server JAR files etc). However, as the number of active clients rises to 1300+, the amount of transferred “data out” also rises quite fast and passes “data in” towards the end. This tells me if the app ever got really popular I would probably be hitting the free tier limits here. But not a real problem at this point, fortunately or unfortunately (I guess more unfortunately :) ).

That is it for the EC2 experiment at this time. However, one final experiment. This time I tried in my local network with the server on one host and the tester on another. Meaning the traffic actually has to hit the router. Would that be interesting? I figured not but there was something..

It looked like this:

[image: mini]

What is interesting here? The fact that the session rate starts to slow down already, and the frequency variation for the tester gets big fast after about 2000 active sessions. So I guess the network issue is not so much just for EC2 but for anything passing beyond localhost. Comforting that, in the sense that the connection to the much further EC2 instance does not seem to make such a big difference as I initially thought from my EC2 tests.

Finally, an interesting part to investigate in this would be to run this local network test while also using SNMP to query the router for various metrics. As I have previously also written a Python script to do that, I might try that at one point. Just because I am too curious.. But not quite now. Cheers.

Performance testing with InfluxDB + Grafana + Telegraf, Part 1

Previously I played around with ElasticSearch + Kibana, which was nice for visualizing some activities on my Minecraft server. I then proceeded to try the same for some server performance metrics data such as CPU and memory use (collected using some Python scripts I created with the psutil library). ES + Kibana are absolutely great tools for analyzing textual data such as logs. Not so good and effective for numerical time-series type data I would say. Maybe this is why it is so often combined with stuff like Logstash etc.

Anyway, after looking around, I then found InfluxDB and Grafana as more suitable alternatives for my performance data. And what is even nicer is, there is now a nice and shiny tool from the InfluxDB folks called Telegraf that collects all those system metrics for me. And since shiny pictures are nice, I thought I would post some of those here.

On this occasion, I am testing a simple backend server for an Android app. Clients register to the server, which then sends updates for subscribed sensors at a 1 second interval to each client. The client is the Android app in this case. My tests run from a single laptop and test how much load the server can handle, by continuously registering new clients to receive more updates.

Off we go, with the initial tests. I first implemented my test client using plain Java SSL sockets. The app uses a custom binary protocol and it is just cost-effective to keep the connection open and push the data through. Which is why the plain socket approach. I used two types of probes to collect the data. The Telegraf plugins for basic system resources, such as CPU, memory and network traffic. And a few custom probes I inserted into the application and into the tester to collect additional data.

Anyway, pics you say:

[image: socket_sessions]

This one shows the custom metrics I collected from the server/tester. The memory chart contains four different values. The “mem_used” metric measures the amount of reserved RAM in the JVM. The “mem_total” measures the total amount of RAM the JVM has allocated of all the possible RAM it has been given. In this case, I gave the JVM a max value of 1G RAM to allocate. This chart shows the JVM never allocates more than 250MB of this, so this is plenty. This is reflected here in the “mem_free_max” metric, which shows how much more at the most the server could allocate beyond what the JVM has already allocated.
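For reference, memory figures like these can be read from the standard Runtime API. The sketch below is a guess at how such a probe might compute them, not the actual probe code. The mapping of each Runtime value to the metric names “mem_used”, “mem_total” and “mem_free_max” is an assumption, and the class name JvmMemoryProbe is made up.

```java
public class JvmMemoryProbe {

  /** Returns {used, total, freeMax} in bytes, read from the running JVM. */
  public static long[] snapshot() {
    Runtime rt = Runtime.getRuntime();
    long total = rt.totalMemory();         //heap the JVM currently holds (assumed "mem_total")
    long used = total - rt.freeMemory();   //part of that heap that is in use (assumed "mem_used")
    long freeMax = rt.maxMemory() - total; //how much more the JVM may still take (assumed "mem_free_max")
    return new long[] {used, total, freeMax};
  }

  public static void main(String[] args) {
    long[] s = snapshot();
    long mb = 1024 * 1024;
    System.out.println("mem_used=" + s[0] / mb + "MB mem_total=" + s[1] / mb
        + "MB mem_free_max=" + s[2] / mb + "MB");
  }
}
```

Starting the JVM with a heap limit such as -Xmx1g is what sets the value returned by maxMemory().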

The session count metric is a measure of how many clients are connected to the server at a time (active sessions). The “frequency” metric here is a measure of how much time, on average, it takes at each point for the server to send the requested data updates to the clients. This chart shows the deviation from the expected 1000 milliseconds is at most 4 milliseconds, so well within range.

These “custom” metrics are all collected using my own measurement agents I deployed both at the server and tester end. With the server, I can also use them during production deployment to keep monitoring the performance, so they are useful beyond performance testing as well.

In this initial try with the plain sockets approach, I got up to around 2000 client connections. At this point, I ran into the operating system thread limit. I tried to get around that with another approach but more on that later.

For metrics on the system resources, I used the Telegraf tool, which had the option to directly collect operating system (and various other) metrics and push them into InfluxDB. I forgot to turn Telegraf on at first when doing this example so it is a bit short. Anyway, visualization:

[image: socket_cpu]

Here, the top row shows CPU use percentage for each of the four cores on the system. The middle row shows the average of these for total system CPU load. The bottom row is for network traffic as bytes in/out. Unfortunately this did not seem to work on OSX so it is just a flatline for now.

In any case, 2000 clients seems plenty enough to test for my app that has never seen more than a fraction of that in installs. But since I am always too curious, I wanted to try to implement this properly. So I went with Netty (as I needed both Java NIO and SSL support which is horrible to implement by yourself, thank you very much Java engineers..).

So, using the same visualizations, here are some graphs.

[image: start_sessions]

This one shows the start of the test execution. Notice the flatline part on the left when the test is started? The flatline is the measurement agents showing the server JVM statistics before the test clients are started up. Interestingly, the JVM starts by allocating about 256MB of memory out of the possible 1GB. Shortly after more load is put on the JVM, it goes down much closer to the actual usage limit.

As for the frequency measure, it is a bit off at the start which is quite normal I would guess for the first measure to calibrate itself. What is more noticeable here is that I added a second frequency measure. Now there are two overlapping lines, one for the delay observed at the server end, and one for the tester end. This gives me a way to see if the delay is due to too much load on the server or too much load on the network (and avoids need for time synchronization such as NTP or PTP when I distribute this). It also has the benefit that I can keep observing the server delay using the same agent in production.
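A frequency probe of this kind can be quite small. The sketch below is an illustration under assumptions, not the real agent: it takes the metric to be the mean interval between consecutive updates, measured with the local clock only, which is why the server and tester values can be compared without synchronizing their clocks. The class name UpdateIntervalProbe and its methods are made up.

```java
public class UpdateIntervalProbe {
  private boolean seenFirst = false;
  private long lastNanos;
  private long sumNanos = 0;
  private long count = 0;

  /** Call once per update sent (server side) or received (tester side), e.g. with System.nanoTime(). */
  public synchronized void onUpdate(long nowNanos) {
    if (seenFirst) {
      sumNanos += nowNanos - lastNanos;
      count++;
    }
    seenFirst = true;
    lastNanos = nowNanos;
  }

  /** Mean interval in milliseconds since the last report, or -1 if no interval was measured. */
  public synchronized double reportAndReset() {
    double mean = count == 0 ? -1 : (sumNanos / 1_000_000.0) / count;
    sumNanos = 0;
    count = 0;
    return mean;
  }

  public static void main(String[] args) {
    UpdateIntervalProbe probe = new UpdateIntervalProbe();
    //made-up timestamps: updates 1000 ms and then 1004 ms apart
    probe.onUpdate(0L);
    probe.onUpdate(1_000_000_000L);
    probe.onUpdate(2_004_000_000L);
    System.out.println(probe.reportAndReset()); //mean of the two intervals, in ms
  }
}
```

The reported value would then be written to InfluxDB at each reporting interval, next to the session count.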

After a longer run, this is how far I end up at:

[image: end4k_sessions]

At around 4000 client connections I hit another limit. This time I have passed the thread count limit but I hit a limit on the number of available open file descriptors on the operating system. File descriptors are also used for sockets, and the OS has a limit on how many can be opened at once. Apparently this can be modified using various tricks involving ulimit and various kernel parameter tuning on OSX. However, I do not wish to mess up my personal laptop so I give up on this and take the 4000 as enough for now. Otherwise I would take a virtual machine or several, install Ubuntu and mess them up with the file limits as a VM is easy to redo. Then make several and bombard my server with all of them. But for now this is fine.
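A limit like this could also be watched from inside the server as one more custom metric. A minimal sketch, assuming a HotSpot/OpenJDK style JVM on a Unix-like OS, where the operating system bean implements com.sun.management.UnixOperatingSystemMXBean. The class name FdProbe is made up, and this is not part of the probes described above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdProbe {

  /** Returns {open, max} file descriptor counts, or null if this JVM does not expose them. */
  public static long[] descriptorCounts() {
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
      com.sun.management.UnixOperatingSystemMXBean unix =
          (com.sun.management.UnixOperatingSystemMXBean) os;
      return new long[] {unix.getOpenFileDescriptorCount(), unix.getMaxFileDescriptorCount()};
    }
    return null; //e.g. on Windows, or on a JVM without this bean
  }

  public static void main(String[] args) {
    long[] fds = descriptorCounts();
    if (fds == null) {
      System.out.println("file descriptor counts not available on this JVM");
    } else {
      System.out.println("open=" + fds[0] + " max=" + fds[1]);
    }
  }
}
```

Charting the open count next to the session count would show the limit approaching before connections start to fail.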

For the Telegraf metrics we get this:

[image: end4k_cpu]

So the CPU load goes up steadily towards the end and suddenly drops at the end. Which is kind of weird as I would expect it to really drop after the test finishes, not 30 seconds before it. Either there is some issue with different ways the agents take time or there are some delays in processing some metrics. I should probably investigate further. However, this would not really affect my performance test so I am fine with that for now.

Anything else? Well, there are a few interesting bits to show. Let’s see.

Here is a frequency spike that hides all other variation due to being so large:

[image: freq_spike]

We can drill down on the flatline to see the variation there much like we did in Kibana (click and drag to select on the chart):

[image: freq_spike_focused]

Of course, it is a valid question here if the variation at this very fine granularity makes a difference or if the large spike is the only thing of real interest here. But it illustrates how these fancy tools can be nice.

Another thing of interest. During some test runs, I managed to get some weird large spikes in the frequency (as well as session registration):

[image: logspike_session]

These spikes are the same for both the tester and server frequency, and the number of new sessions that get through at this time is also flatlining. So something must hang the system for close to 30 seconds here. I would suspect some network lag here, but I believe I was running both the tester and server on localhost, so it should not be that (although InfluxDB is on another host). Also, the fact that the server spike is identical (overlapping line) with the tester spike is against this. So a look at the CPU load for this period:

[image: logspike_cpu]

There does seem to be some correlation between the highest CPU spikes and the lag spikes in update frequencies. So I tried to think what I had changed or what could be causing this. I figured it was probably the fact that I had added some pretty verbose logging and enabled log rotation + compression. So I figured maybe logs are filling up fast and getting rotated + compressed at those points. So I reduced logging to sane levels and the spikes never came back. Maybe it was that, or maybe it was the sun spots. But it’s gone, so fine with me..

So that is some basic example of how I used this toolset for performance testing. Now, in practice I would not run the server on localhost, nor would the client apps be running on localhost (or the same host as the server is on). So I should actually test it on the real server environment. About that next…

Serialization JSON/Avro/Protocol buffers

Introduction

Recently I have been trying to figure out nice and effective approaches for serializing my data across different nodes. Sitting back in the rocking chair, I can recall how CORBA once upon a time was supposed to be cool (it was binary and generated code, I believe) but somehow was always oh so broken (only tried a few times and long ago). Then there was RMI, which I recall abusing in weird ways and ending up serializing half the app for remote calls (usually not noticing). But hey, it worked. Then came all the horrible and bloated XML hype (SOAP still makes me cringe). These days JSON is all the rage instead of XML, and then there is the shift back towards effective binary representations in CORBA style but doing it a bit more right with Avro/Protocol buffers/Thrift (and probably a bunch of others I never heard of).

So trying to get back into a bit more effective programming approaches, I tried some of these out for two different types of applications. One is an Android app with a backend server. Both ends written in Java. The server streams continuous location updates to a number of Android clients at one second interval. The second application is a collection of sensor nodes producing data to be processed in a “cloud” environment with a bunch of the trendy big data tools. Think sensor->kafka->storm/spark/influxdb/etc. Server side mainly Java (so far) and probe interfaces Python 3/Java (so far).

For the Android app, I hosted the server in Amazon EC2 and wanted to make everything as resource efficient as possible to keep the expenses cheap. For the sensor/big data platform the efficiency is required due to the volume etc. (I think there is supposed to be X number of V’s there but I am just going to ignore that, arrr..). So my experiences:

JSON:
+No need for explicit schema, very flexible to pass just about anything around in it
+Works great for heterogeneous sensor data
+Very easy to read, meaning development and debugging is easy
+Can be easily generated without any special frameworks, just add a bunch of strings together and you are done
+Extensive platform support
+Good documentation as it is so pervasive. At least for the format itself, frameworks might vary..

-Takes a lot of space as everything is a string and tag names, etc., are always repeated
-String parsing is always less efficient in performance as well than custom binaries

Avro:
+Supported on multiple platforms. Works for me both on Python3 and Java. Other platforms seem extensively supported, but what do I know, I haven’t tried them. Anyway, works for me..
+Effective binary packaging format. Define a schema, generate code that creates and parses those binaries for you.
+Documentation is decent, even if not great. Mostly good enough for me though.
+Quite responsive mailing lists for help
+Supports reflection, so I could write a generic parser to take any schema with a general structure of “header” containing a set of metadata fields and “body” a set of values, and, for example, store those in InfluxDB (or any other data store) without having to separately write deserializers for each. This seems to be also one of the design goals so it is well documented and supported out-of-the-box.

-The schema definition language is JSON, which is horrible looking, overly verbose and very error prone.
-The generated Java code API is a bit verbose as well, and it generates separate classes for inner “nested records”. For example, consider that I have a data type (record) called TemperatureMeasure, with inner fields called “header” and “body” which are object types themselves (called “records” in Avro). Avro then generates separate classes for Header and Body. Annoying when having multiple data types, each with a nested Header and Body element. Need to come up with different names for each Header and Body (e.g., TemperatureHeader), or they will not work in the same namespace.
-The Python API is even more weird (or I just don’t know how to do it properly). Had to write the objects almost in a JSON format but not quite. Trying to get that right by manually writing text in a format that some unknown compiler expects somewhere is no fun.
-As a typical Apache project (or enterprise Java software in general), it has a load of dependencies in the jar files (and their transitive dependencies). I gave up trying to deploy Avro as a serialization scheme for my Android app as the Android compiler kept complaining about yet another missing jar dependency all the time. Also don’t need the bloat there.

Protocol buffers:
+Effective binary packaging format. Define a schema, generate code that creates and parses those binaries for you. Déjà vu.
+Documentation is very good, clear examples for the basics which is pretty much everything there is to do with this type of a framework (generate code and use it for serializing/deserializing)
+The schema definition language is customized for the purpose and much clearer than the Avro JSON mess. I also prefer the explicit control over the data structures. Makes me feel all warm and fuzzy, knowing exactly what it is going to do for me.
+Only one dependency on a Google jar. No transitive dependencies. Deploys nicely on Android with no fuss and no problems.
+Very clear Java API for creating, serializing and deserializing objects. Also keeps everything in a single generated class (for Java) so much easier to manage those headers and bodies..
+The Python API also seems much more explicit and clear than the Avro JSON style manual text I ended up writing.

-Does not support Python 3 (at the time of writing), I suppose they only use Python2.7 at Google. Makes sense to consolidate on a platform, I am sure, but does not help me here.
-No reflection support. The docs say Google never saw any need for it. Well OK, but does not work for me. There is some support mentioned under special techniques but that part is poorly documented and after a little while of trying I gave up and just used Avro for that.

Conclusions
So that was my experience there. There are posts comparing the effectiveness in how compact the stuff is etc between Avro, Protocol buffers, Thrift and so on. To me they all seemed good enough. And yes, Thrift I did not try for this exercise. Why not? Because it is poorly documented (or that was my experience) and I had what I needed in Avro + Protocol buffers. Thrift is intended to work as a full RPC stack and not just serialization so it seems to be documented that way. I just wanted the serialization part. From the articles and comparisons I could find, Thrift seemed to have an even nicer schema definition language than Protocol buffers, and should have some decent platform support. But, hey if I can’t easily figure out from the docs how to do it, I will just go elsewhere (cause I am lazy in that way…).

In the end, I picked Avro for the sensor data platform and Protocol Buffers for the Android app. If PB had support for reflection and Python 3 I would much rather have picked that but it was a no go there.

Finally, something to note is that there are all sorts of weird concepts out there I ran into trying these things. I needed to stream the sensor data over Kafka and as Avro is advertised as “schemaless” or something like that, I thought that would be great. Just stick the data in the pipe and it magically parses back on the other end. Well, of course no magic like that exists, so either every message needs to contain the schema or the schema needs to be identified on the other end somehow. The schema evolution stuff seems to refer to the way Avro stores the schema in files with a large set of adjoining records. There it makes sense as it is only stored once with a large set of records. But it makes no sense to send the schema with every message in a stream. So I ended up prefixing every Kafka message with the schema id and using a “schema repository” at the other end to fix the format (just a hashmap really). But I did this with both Avro and PB so no difference there. However, initially I thought there was going to be some impressive magic there when reading the adverts for it. As usual, I was wrong..
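The schema id prefix idea can be sketched as below. This is an illustration under assumptions rather than the code I used: the id is taken to be a 4-byte integer at the start of each message, the “schema repository” is a plain HashMap from id to a decoder function, and the decoder in the example just turns bytes into a String where an Avro or Protocol Buffers parser would go. The name SchemaIdFraming and its methods are made up.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class SchemaIdFraming {
  //the "schema repository": schema id -> decoder for payloads written with that schema
  private final Map<Integer, Function<byte[], Object>> decoders = new HashMap<>();

  public void register(int schemaId, Function<byte[], Object> decoder) {
    decoders.put(schemaId, decoder);
  }

  /** Producer side: 4-byte big-endian schema id followed by the serialized payload. */
  public static byte[] frame(int schemaId, byte[] payload) {
    return ByteBuffer.allocate(4 + payload.length).putInt(schemaId).put(payload).array();
  }

  /** Consumer side: read the id, look up the decoder, hand it the remaining bytes. */
  public Object decode(byte[] message) {
    ByteBuffer buf = ByteBuffer.wrap(message);
    int schemaId = buf.getInt();
    Function<byte[], Object> decoder = decoders.get(schemaId);
    if (decoder == null) {
      throw new IllegalArgumentException("unknown schema id " + schemaId);
    }
    byte[] payload = new byte[buf.remaining()];
    buf.get(payload);
    return decoder.apply(payload);
  }

  public static void main(String[] args) {
    SchemaIdFraming repository = new SchemaIdFraming();
    //stand-in decoder: a real one would call the generated Avro/PB parsing code
    repository.register(7, bytes -> new String(bytes, StandardCharsets.UTF_8));
    byte[] message = frame(7, "temp=21.5".getBytes(StandardCharsets.UTF_8));
    System.out.println(repository.decode(message));
  }
}
```

The framed byte array is what would go into Kafka as the message value. Both ends only need to agree on the id to schema mapping.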

Sending SOAP messages in Java without any libraries

Needed to send some SOAP requests in Android. There was no nice library available to do it all. Not that I would care for them much, SOAP is bloated, Java libraries are typically bloated, and combining the two probably gets obese. Anyway, how can I send SOAP messages with only the most basic methods?

After a bunch of experiments, here is one just using plain sockets:

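import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

  public static void send(String hostname, String uri, String msg) throws Exception {
    int port = 80;

    String authString = "username:password";
    String authStringEnc = Base64.getEncoder().encodeToString(authString.getBytes(StandardCharsets.UTF_8));

    //build the whole request in memory first, so it can be printed exactly as it is sent
    StringWriter sw = new StringWriter();
    PrintWriter pw = new PrintWriter(sw);
    pw.print("POST "+uri+" HTTP/1.1");
    pw.print("\r\n");
    pw.print("Host: " + hostname);
    pw.print("\r\n");
//    pw.print("Content-Type: application/soap+xml; charset=utf-8");
    pw.print("Content-Type: text/xml");
    pw.print("\r\n");
    pw.print("Authorization: Basic " + authStringEnc);
    pw.print("\r\n");
    //ask the server to close the connection after the response,
    //otherwise the read loop below blocks on the kept-alive connection
    pw.print("Connection: close");
    pw.print("\r\n");
    //the length is counted in bytes, using the same encoding as is written to the socket
    int length = msg.getBytes(StandardCharsets.UTF_8).length;
    pw.print("Content-Length: " + length);
    pw.print("\r\n");
    pw.print("\r\n");
    pw.print(msg);
    pw.flush();

    String all = sw.toString();
    System.out.println("SENDING:");
    System.out.println(all);

    //try-with-resources closes the socket (and its streams) also when something fails
    try (Socket socket = new Socket(hostname, port)) {
      PrintWriter pw2 = new PrintWriter(new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8));
      pw2.print(all);
      pw2.flush();

      BufferedReader reader = new BufferedReader(new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
      for (String line ; (line = reader.readLine()) != null ; ) {
        System.out.println(line);
      }
    }
  }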

Why all the print(“\r\n”) and not println()? Because I was going nuts trying to figure out what the problem was and tried to ensure no linefeed conversion problems would be there.. (println() writes the platform line separator, while HTTP expects CRLF line endings.)

Problem: At first I looked up the examples from W3C SOAP tutorials/examples. There the content-type was set to “application/soap+xml” as shown in the commented line in the example above. This kept producing “500 internal server error”, with “soap-fault” mentioned in the header. Was confusing since there seemed to be no other explanation. I had copy-pasted the response read code from somewhere and it stopped after headers.. Fixed in the above now.

Anyway, after installing Wireshark (what a pain on OSX..), I finally figured it out. There was also an error string in the response body “Transport level information does not match with SOAP Message namespace URI”. The problem is that the namespace in my SOAP message was for SOAP 1.1, and the content-type “application/soap+xml” is for SOAP 1.2. And in some combination the SOAPAction header would be required as well.. The answer? Change the content-type to “text/xml”, which makes it all match SOAP 1.1 specs. Whooppee.
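The mismatch described above can be caught before sending by picking the content type from the envelope namespace in the message. A minimal sketch of that check: the two namespace URIs are the standard SOAP 1.1 and SOAP 1.2 envelope namespaces, while the class name SoapVersionSketch, the method contentTypeFor and the crude substring check (instead of real XML parsing) are simplifications made up for illustration.

```java
public class SoapVersionSketch {
  public static final String SOAP_11_NS = "http://schemas.xmlsoap.org/soap/envelope/";
  public static final String SOAP_12_NS = "http://www.w3.org/2003/05/soap-envelope";

  /** Picks the Content-Type that matches the envelope namespace used in the message. */
  public static String contentTypeFor(String soapMessage) {
    if (soapMessage.contains(SOAP_12_NS)) {
      return "application/soap+xml; charset=utf-8"; //SOAP 1.2
    }
    if (soapMessage.contains(SOAP_11_NS)) {
      return "text/xml; charset=utf-8";             //SOAP 1.1
    }
    throw new IllegalArgumentException("no known SOAP envelope namespace found");
  }

  public static void main(String[] args) {
    //made-up minimal SOAP 1.1 envelope
    String msg = "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_11_NS + "\"><soapenv:Body/></soapenv:Envelope>";
    System.out.println(contentTypeFor(msg));
  }
}
```

For SOAP 1.1 the HTTP binding also defines a SOAPAction request header, which is probably the “some combination” mentioned above. Whether a given server insists on it varies.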

Some good diffs at http://www.cnblogs.com/wangpei/p/3937541.html.

The socket version is handy to print out exactly what you are sending and to debug if the error is in some configuration of the libraries used or just in the data being sent. Since the socket version uses no libraries, it should rule that out..

Of course, when running this on Android I might as well use the Apache HttpClient that comes with it. Here is a version using Apache HttpClient v4.4.1 on a desktop (yesyes port it to Android later..):

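import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.nio.charset.StandardCharsets;
import java.util.Base64;

  public static void sendHTTPClient(String url, String body) throws Exception {
    String authString = "username:password";
    String authStringEnc = Base64.getEncoder().encodeToString(authString.getBytes(StandardCharsets.UTF_8));

    HttpPost post = new HttpPost(url);
    StringEntity strEntity = new StringEntity(body, "UTF-8");
    //text/xml to match the SOAP 1.1 namespace, as explained above
    strEntity.setContentType("text/xml");
    post.setHeader("Authorization", "Basic " + authStringEnc);
    post.setEntity(strEntity);

    //try-with-resources releases the connection and the client even if the request fails
    try (CloseableHttpClient httpclient = HttpClients.createDefault();
         CloseableHttpResponse response = httpclient.execute(post)) {
      HttpEntity respEntity = response.getEntity();

      System.out.println("Response:");
      System.out.println(EntityUtils.toString(respEntity));
    }
  }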

Installing JDK8 on linux (debian/ubuntu)

So how do I install JDK8 on debian etc these days when there seems to be some mess with Oracle licensing and linux package managers?

Found some nice instructions at https://www.digitalocean.com/community/tutorials/how-to-manually-install-oracle-java-on-a-debian-or-ubuntu-vps

To summarize:
Download the .tar.gz file for linux from the Oracle website, such as http://www.oracle.com/technetwork/java/javase/downloads, and unpack it to some nice and shiny directory. The above link suggests /opt/jdk, so to do this

first we upgrade to superuser privileges to avoid need to sudo all the time

sudo su

which apparently switches to root user (su is for switching user, with no param it is root), then we do

mkdir /opt/jdk
tar -zxf jdk-8u20-linux-x64.tar.gz -C /opt/jdk

to unpack the jdk distribution to /opt/jdk

now the above link suggests to do this

update-alternatives --install /usr/bin/java java /opt/jdk/jdk1.8.0_20/bin/java 100
update-alternatives --install /usr/bin/javac javac /opt/jdk/jdk1.8.0_20/bin/javac 100

which should install alternatives for java and javac commands with priority 100. Apparently the system should use these priorities to pick one of the alternatives over the other. This did not quite work well on my system as I found there was another version already installed with even higher priorities etc. And I would need to tune this and whatever.. So to fix it here is what I did

update-alternatives --list java

which gave me this

/opt/jdk/jdk1.8.0_20/bin/java
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java

as I did not need JDK7 the fix is simply this

update-alternatives --remove java /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java

after which “java -version” finally gives me JDK8. Repeat the same for javac. After a system update JDK7 seems to get re-installed, so this is not necessarily a long-term fix; it would be better to remove JDK7 completely or fix the priorities. But it fixed things for me for now..

note: even if no java is installed before, the update-alternatives seems to work. whee
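
To double-check which JDK the java command really ends up launching after fiddling with the alternatives, a tiny program can print the JVM's own idea of its version and install directory. This is just a minimal sketch; the class name WhichJava is made up for illustration.

```java
// Hypothetical helper class, not part of the original instructions.
class WhichJava {
  /** Returns a one-line summary of the JVM that is actually running this code. */
  static String describe() {
    return "java.version=" + System.getProperty("java.version")
        + ", java.home=" + System.getProperty("java.home");
  }

  public static void main(String[] args) {
    System.out.println(describe());
  }
}
```

Compile and run it with javac and java from the PATH. If java.home points somewhere other than /opt/jdk, the alternatives are still picking another installation.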

Was playing around a bit with AngularJS. Seems like a great way to structure your templates at client vs server-side. After playing around with Jersey and a bunch of other Java REST frameworks, I finally decided to go with just a basic Servlet container and have that provide some JSON data over a REST style interface. There were just too many badly documented, weird requirements in the frameworks, and I couldn’t really figure out the benefits.. anyway

Getting the data to AngularJS is not hard. Just have the Servlet return the page with the template. Then for dynamically querying data from server in Angular I can make a call such as

    function myController($scope,$http) {
        $http.get("http://localhost:5555/hello").
                success(function(response) {
                    $scope.values = response.my_data;
                }
        );
    }

Using response.variablename I can also put the contents of the returned JSON object up for parsing by other parts of the template, or just use plain “response” for using it as is. But this only covers the simple scenario of having no parameters in the query. What if I need to perform a post with some request parameters?

I tried

    function myController($scope,$http) {
        var data = "msg="+JSON.stringify({count:10});
        $http.post("http://localhost:5555/hello", data).
                success(function(response) {
                    $scope.values = response;
                }
        );
    }

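I thought this would allow me to access it in a Servlet using request.getParameter("msg"). But this just returns null.. Seems to be because the post from AngularJS is not an HTML Form post: by default $http.post sends the body with a JSON content type (application/json) instead of application/x-www-form-urlencoded, so the Servlet container does not parse it into request parameters as I was expecting. So we need to read and parse the body ourselves..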

  private void showPage(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    // Read from request
    StringBuilder buffer = new StringBuilder();
    BufferedReader reader = req.getReader();
    String line;
    while ((line = reader.readLine()) != null) {
      buffer.append(line);
    }
    String data = buffer.toString();

    log.debug("Latest request:"+data);

    try {
      JSONObject json = new JSONObject(data);
...

And then we have the request data in the “json” variable. And it is better now to forget the “msg=..” part from above attempt, and just send over the plain JSON.

Looking at the REST frameworks, I guess the simplest way would be to just embed the parameters in the request URL such as /hello/my_parameter. Would probably need a ServletFilter to parse all that.. Maybe someday..
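
The path parsing part of that idea can be sketched without any framework. This assumes the Servlet is mapped to something like /hello/*, in which case HttpServletRequest.getPathInfo() should return the part after /hello (for example "/my_parameter"). The PathParams class and its parse method are hypothetical names for illustration only, and the Servlet API itself is left out so the snippet stands on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns a path-info string into a list of parameters.
class PathParams {
  /** Splits e.g. "/my_parameter/42" into ["my_parameter", "42"]. */
  static List<String> parse(String pathInfo) {
    List<String> params = new ArrayList<>();
    if (pathInfo == null) {
      // getPathInfo() returns null when there is nothing after the servlet path
      return params;
    }
    for (String part : pathInfo.split("/")) {
      if (!part.isEmpty()) {
        params.add(part); // skip the empty piece before the leading slash
      }
    }
    return params;
  }

  public static void main(String[] args) {
    System.out.println(parse("/my_parameter/42"));
  }
}
```

In a Servlet this would be called as PathParams.parse(req.getPathInfo()) inside doGet or doPost.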

programmatic jdk logging configuration

something so wrong with all this being so hard.. so lets try to document a few words

problem: how do you programmatically configure the JDK logging so you get two different log streams, one to standard output and another to a file? everything i tried was either outputting everything or nothing.

trying to search for a solution you come up with all sorts of weird stuff. some of these shortly.

logger.setUseParentHandlers(false);

well, the line above should stop the logging hierarchy from spouting stuff through the global loggers that are supposedly always there. not that this ever did anything for me, but the whole thing with the “” name for the root logger and default configurations is just weird. so I am still using the above line just to be sure but whatever.

another attempt

LogManager.getLogManager().reset();
for (Handler handler : java.util.logging.Logger.getLogger("").getHandlers()) {
  handler.setLevel(Level.OFF);
}

so some claim this will remove any excess loggers that are in by default. did nothing for me. similarly tried to remove all handlers on startup. did nothing for me.

ok so use the consolehandler, which sort of works

logger = java.util.logging.Logger.getLogger(name);
logger.setUseParentHandlers(false);
logger.setLevel(Level.ALL);
Handler console = new ConsoleHandler();
//custom formatter
console.setFormatter(new LogFormatter());
console.setLevel(Level.INFO);
logger.addHandler(console);

So, first of all, the logger should have the level Level.ALL, since the logger itself blocks any messages below its own level from reaching the handlers attached to it. So this is the first level of filtering. The second level is the handlers attached to the logger, each with its own level.
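
To see the two filtering levels in action, here is a small sketch that attaches two handlers with different levels to one logger and records what each of them accepts. The class names, the logger name "demo.twolevel" and the messages are all made up for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class TwoLevelFilterDemo {
  /** Handler that just remembers the messages it accepted. */
  static class CollectingHandler extends Handler {
    final List<String> messages = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      // isLoggable applies this handler's own level (the second filter)
      if (isLoggable(record)) {
        messages.add(record.getMessage());
      }
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }

  static String run() {
    Logger logger = Logger.getLogger("demo.twolevel");
    logger.setUseParentHandlers(false);
    logger.setLevel(Level.ALL); // first filter: let everything through

    CollectingHandler infoAndUp = new CollectingHandler();
    infoAndUp.setLevel(Level.INFO);
    CollectingHandler everything = new CollectingHandler();
    everything.setLevel(Level.ALL);
    logger.addHandler(infoAndUp);
    logger.addHandler(everything);

    logger.fine("debug detail");
    logger.info("hello");
    logger.warning("careful");

    // clean up so repeated runs do not stack handlers on the same logger
    logger.removeHandler(infoAndUp);
    logger.removeHandler(everything);
    return "infoAndUp=" + infoAndUp.messages + " everything=" + everything.messages;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Running it prints infoAndUp=[hello, careful] everything=[debug detail, hello, careful]. If the logger level were left at the default INFO, the FINE message would not reach either handler.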

Now, with the above the logger actually outputs the stuff, only the INFO level stuff and above, and on the console. But it goes to stderr, the System.err stream, because that is where ConsoleHandler writes. I want stdout, the System.out stream. And how do you fix that?

Lots of people suggest using

Handler console = new StreamHandler(System.out, new LogFormatter());

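Which just prints some of the stuff and not others. Most likely this is because StreamHandler buffers its output and, unlike ConsoleHandler, does not flush after every record, so messages only show up once the buffer fills or the handler is flushed or closed. whooppee.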
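
One alternative I have seen suggested for this, which is not what I ended up using, is to subclass StreamHandler and flush after every record, which is what ConsoleHandler does for System.err. A minimal sketch, assuming the missing output really is a buffering issue; FlushingStreamHandler is a made-up name and SimpleFormatter stands in for my own LogFormatter:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

// Hypothetical handler: a StreamHandler that flushes after each record.
class FlushingStreamHandler extends StreamHandler {
  FlushingStreamHandler(OutputStream out) {
    super(out, new SimpleFormatter());
  }

  @Override
  public synchronized void publish(LogRecord record) {
    super.publish(record);
    flush(); // push the record out immediately instead of leaving it buffered
  }

  /** Publishes one record into a byte buffer and returns what has reached it. */
  static String demo() {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    FlushingStreamHandler handler = new FlushingStreamHandler(buffer);
    handler.publish(new LogRecord(Level.INFO, "visible right away"));
    return buffer.toString();
  }

  public static void main(String[] args) {
    System.out.print(demo());
  }
}
```

For stdout this would be used as new FlushingStreamHandler(System.out). Note that closing such a handler also closes the stream it wraps, which is not something you want for System.out.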

Finally, this works

/** This is where the log messages are written. */
private PrintStream out = System.out;
/** For formatting the log statements. */
private LogFormatter formatter = new LogFormatter();

@Override
public void publish(LogRecord record) {
  if (record.getLevel().intValue() < getLevel().intValue()) {
    return;
  }
  out.print(formatter.format(record));
}

@Override
public void flush() {
  //system.out handles itself
}

@Override
public void close() throws SecurityException {
  //system.out handles itself
}

That is, create your own handler with the code above and configure that to your logger. It’s magic.

As for the file stream, the FileHandler from the JDK (java.util.logging.FileHandler) works fine and you just add it similarly to the examples above.
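
Putting the pieces together, the whole two-stream setup could look roughly like the sketch below. The names TwoStreamLogging, StdoutHandler, configure and demo are made up, SimpleFormatter stands in for my own LogFormatter, and the log file is a temp file just for the demonstration.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Formatter;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

class TwoStreamLogging {
  /** Writes accepted records straight to System.out. */
  static class StdoutHandler extends Handler {
    private final PrintStream out = System.out;
    private final Formatter formatter = new SimpleFormatter();

    @Override
    public void publish(LogRecord record) {
      if (!isLoggable(record)) {
        return; // below this handler's level
      }
      out.print(formatter.format(record));
    }

    @Override
    public void flush() {
      out.flush();
    }

    @Override
    public void close() {
      // never close System.out
    }
  }

  /** INFO and above to stdout, everything to the given file. */
  static Logger configure(String name, String logFile) throws IOException {
    Logger logger = Logger.getLogger(name);
    logger.setUseParentHandlers(false);
    logger.setLevel(Level.ALL);

    Handler console = new StdoutHandler();
    console.setLevel(Level.INFO);
    logger.addHandler(console);

    FileHandler file = new FileHandler(logFile);
    file.setFormatter(new SimpleFormatter()); // the FileHandler default is XML
    file.setLevel(Level.ALL);
    logger.addHandler(file);
    return logger;
  }

  /** Logs two messages and returns what ended up in the file. */
  static String demo() {
    try {
      Path file = Files.createTempFile("two-stream", ".log");
      Logger logger = configure("demo.twostream", file.toString());
      logger.fine("only in the file");
      logger.info("in both places");
      for (Handler handler : logger.getHandlers()) {
        handler.close();
        logger.removeHandler(handler);
      }
      return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("--- file contents ---");
    System.out.print(demo());
  }
}
```

The FINE message should show up only in the file, while the INFO message goes to both the file and stdout.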

jetty nio issues

Lately I have been trying to get Jetty to work with Jersey on my computers, and keep getting those “Unable to establish loopback connection” and “Connection refused” errors. Googling this indicates it is a firewall issue but I had to try to debug anyway, right..

So I tried to write some simple code to test if I had problems binding on localhost due to the firewall. Sure enough, I discovered that (in Java) I could bind a ServerSocket to a port and call accept() with no problem. So I thought it had to be something else.

Turns out Jetty uses NIO, which is a bit different. Where it crashes is the NIO call Selector.open(). This does not just bind to a port; it picks one at random and then tries to connect to it. So writing another debug program with that one-liner is enough: Selector.open() always fails for me. This is what kills Jetty as well. Turns out it is a firewall issue after all. But it does not show all the time, since my laptop is configured by the nice people at the office to have different rules depending on which type of network I am in. So sometimes it blocks the loopback and other times not.
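
The debug program is about as small as it sounds. Something along these lines, where SelectorProbe and probe are just names I picked for the sketch:

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Hypothetical probe: checks whether a NIO Selector can be opened at all.
class SelectorProbe {
  /** Returns "ok" if a Selector can be opened, otherwise the failure message. */
  static String probe() {
    try (Selector selector = Selector.open()) {
      return "ok";
    } catch (IOException e) {
      return "failed: " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println("Selector.open(): " + probe());
  }
}
```

If this reports a failure, there is no point in debugging Jetty itself until the loopback connection is allowed.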

So a few things to take home here:

Selector.open() binds on localhost on random port and connects to it. You need both rights to bind and connect to localhost to use Jetty at all. Seems weird to make such assumptions on basic NIO code. But this is what I have to live with.

Sometimes the firewall may block you, sometimes not. Can depend on lots of things like are you on internet, where did you connect from, etc. What is the point in blocking your own localhost from connecting to localhost anyway?

The IPv4 localhost address (127.0.0.1) shows up as 7F00 in IPv6 notation. This is at least what the FW logs show me, and it makes sense since 127.0 is 7F00 in hex, so the full IPv4-mapped address would be ::ffff:7f00:1. Just a reminder for after I forget it.. And the IPv6 localhost address is ::1 (yes, three characters). Some software works for me with this, which I guess is due to the loopback being blocked using IPv4 filters.. too bad it does not work for everything. I think it depends on how the SW binds, whether that is done using an IPv6 or an IPv4 address. The IPv6 address space includes the IPv4 addresses as a mapped sub-range, which might explain why it works for SW that binds using the IPv6 stuff.
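
To save myself from doing the hex conversion by hand next time I read firewall logs, here is a small sketch that turns a dotted IPv4 address into the IPv4-mapped IPv6 form. LoopbackHex and toMappedIpv6 are made-up names.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for reading IPv4 addresses in IPv6-style logs.
class LoopbackHex {
  /** Formats a dotted IPv4 literal as its IPv4-mapped IPv6 form. */
  static String toMappedIpv6(String ipv4Literal) {
    try {
      // for a literal address getByName only checks the format, no DNS lookup
      byte[] b = InetAddress.getByName(ipv4Literal).getAddress();
      if (b.length != 4) {
        throw new IllegalArgumentException("not an IPv4 address: " + ipv4Literal);
      }
      int high = ((b[0] & 0xff) << 8) | (b[1] & 0xff);
      int low = ((b[2] & 0xff) << 8) | (b[3] & 0xff);
      return "::ffff:" + Integer.toHexString(high) + ":" + Integer.toHexString(low);
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(toMappedIpv6("127.0.0.1"));
  }
}
```

For 127.0.0.1 this prints ::ffff:7f00:1, which matches the 7F00 seen in the logs.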

Just saying…

 

Java wildcard classpath

Since JDK 6 you are supposed to be able to write a classpath like

java -cp * foo.Bar

which should set the classpath to contain every .jar file in the current directory. But if you do this, it gives you an error that main class “xxx” could not be found, where “xxx” is the name of the first file in your directory. This is due to shell wildcard expansion: the shell replaces the * with the file names before java ever sees it (at least on windows with cygwin installed, although i guess it does not matter). So to fix it, you need to write the classpath as something the shell does not recognize as a path pattern to expand (since * can work in normal path names, not just classpath). So for example,

java -classpath .;.\* foo.Bar

works.
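
When in doubt about what the JVM actually received, it can help to print the classpath from inside the program. A minimal sketch, where ShowClasspath is a made-up class name:

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical debugging helper: lists the classpath entries the JVM is using.
class ShowClasspath {
  /** Splits the java.class.path system property into its entries. */
  static List<String> entries() {
    String classpath = System.getProperty("java.class.path");
    // pathSeparator is ";" on Windows and ":" on Linux
    return Arrays.asList(classpath.split(Pattern.quote(File.pathSeparator)));
  }

  public static void main(String[] args) {
    for (String entry : entries()) {
      System.out.println(entry);
    }
  }
}
```

Put it in one of the directories on the classpath and run it with the same -classpath argument as the real program to compare what comes out.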