
Hacker News

Java is fast, code might not be

148 points by siegers

by liampulles

11 subcomments

Understanding algorithmic complexity (in particular, avoiding rework in loops) is useful in any language, and is sage advice.
In practice though, for most enterprise web services, a lot of real-world performance comes down to how efficiently you are calling external services (including the database). Just converting a loop of queries into bulk ones can help loads (and then tweaking the query to make good use of indexes, doing upserts, removing unneeded data, etc.).
I'm hopeful that improvements in LLMs mean we can ditch ORMs (under the guise that they make queries and the in-between mapping code quicker to write) and instead make good use of SQL to harness the powers that modern databases provide.
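A minimal sketch of the loop-versus-bulk point, assuming a hypothetical `FakeUserStore` that stands in for a database client and only counts round trips. None of these names come from the article or from a real library; with a real driver the bulk call would roughly correspond to one `WHERE id IN (...)` query.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BulkLookupDemo {
    /** Hypothetical stand-in for a database client; it only counts round trips. */
    static final class FakeUserStore {
        private final Map<Integer, String> rows = new HashMap<>();
        int roundTrips = 0;

        FakeUserStore() {
            for (int id = 1; id <= 100; id++) rows.put(id, "user" + id);
        }

        // Stands in for: SELECT name FROM users WHERE id = ?
        String findById(int id) {
            roundTrips++;
            return rows.get(id);
        }

        // Stands in for: SELECT id, name FROM users WHERE id IN (...)
        Map<Integer, String> findByIds(Collection<Integer> ids) {
            roundTrips++;
            Map<Integer, String> found = new HashMap<>();
            for (Integer id : ids) {
                String name = rows.get(id);
                if (name != null) found.put(id, name);
            }
            return found;
        }
    }

    // One round trip per id.
    static List<String> namesOneByOne(FakeUserStore store, List<Integer> ids) {
        List<String> names = new ArrayList<>();
        for (int id : ids) names.add(store.findById(id));
        return names;
    }

    // One round trip in total; results are put back in the order of the input ids.
    static List<String> namesInBulk(FakeUserStore store, List<Integer> ids) {
        Map<Integer, String> byId = store.findByIds(ids);
        List<String> names = new ArrayList<>();
        for (int id : ids) names.add(byId.get(id));
        return names;
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(3, 1, 2);
        FakeUserStore loopStore = new FakeUserStore();
        FakeUserStore bulkStore = new FakeUserStore();
        System.out.println(namesOneByOne(loopStore, ids) + " round trips: " + loopStore.roundTrips);
        System.out.println(namesInBulk(bulkStore, ids) + " round trips: " + bulkStore.roundTrips);
    }
}
```

Both methods return the same list; only the number of simulated round trips differs, which is the cost that tends to dominate when each call crosses the network.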

by layer8

0 subcomment

For #5, the “fix” [0] is incomplete, because you will still get a NumberFormatException when the value is out of range. For int, you could check whether there are fewer or more than 10 digits, and use parseLong() when there are exactly 10 digits. For long, you can use BigInteger when there are exactly 19 digits (Long.MAX_VALUE has 19). After skipping any leading zeros, of course. Or you could just replicate the JDK’s parsing implementation and change the part where it throws NumberFormatException (at the possible cost of forgoing JIT intrinsics).
A second bug is that Character.isDigit() returns true for non-ASCII Unicode digits as well, while Integer.parseInt() only supports ASCII digits.
Another bug is that the code will fail on the input string "-".
Lastly, using value.isBlank() is a pessimization over value.isEmpty() (or just checking value.length(), which is read anyway in the next line), given that the loop would break on the first blank character.
[0]
```
    public int parseOrDefault(String value, int defaultValue) {
        if (value == null || value.isBlank()) return defaultValue;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (i == 0 && c == '-') continue;
            if (!Character.isDigit(c)) return defaultValue;
        }
        return Integer.parseInt(value);
    }
```
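One way to address the points above, as a sketch and not the article's or the JDK's implementation: do the parsing by hand, accumulate into a long, and return the default instead of throwing. The class name `SafeIntParser` is hypothetical, and restricting input to ASCII digits with an optional leading sign is an assumption about what the caller wants.

```java
final class SafeIntParser {
    private SafeIntParser() {}

    /**
     * Parses a base-10 int without throwing. Returns defaultValue for null,
     * empty input, a lone sign, non-ASCII-digit characters, or values outside
     * the int range.
     */
    static int parseOrDefault(String value, int defaultValue) {
        if (value == null || value.isEmpty()) return defaultValue;

        int start = 0;
        boolean negative = false;
        char first = value.charAt(0);
        if (first == '-' || first == '+') {
            negative = (first == '-');
            start = 1;
        }
        if (start == value.length()) return defaultValue; // input was just "-" or "+"

        // Largest allowed magnitude: 2147483648 for negatives, 2147483647 otherwise.
        long limit = negative ? -(long) Integer.MIN_VALUE : Integer.MAX_VALUE;
        long magnitude = 0;
        for (int i = start; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c < '0' || c > '9') return defaultValue;   // ASCII digits only
            magnitude = magnitude * 10 + (c - '0');
            if (magnitude > limit) return defaultValue;    // out of int range
        }
        return (int) (negative ? -magnitude : magnitude);
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", -1));
        System.out.println(parseOrDefault("99999999999", -1));
        System.out.println(parseOrDefault("-", -1));
    }
}
```

Because the magnitude is checked against the limit after every digit, the long accumulator never overflows, and leading zeros need no special handling.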

by spankalee

3 subcomments

Avoiding Java's string footguns is an interesting problem in programming languages design.
The String.format() problem is most immediately a bad compiler and bad implementation, IMO. It's not difficult to special-case literal strings as the first argument, do parsing at compile time, and pass in a structured representation. The method could also do runtime caching. Even a very small LRU cache would fix a lot of common cases. At the very least they should let you make a formatter from a specific format string and reuse it, like you can with regexes, to explicitly opt into better performance.
But ultimately the string templates proposal should come back and fix this at the language level. Better syntax and guaranteed compile-time construction of the template. The language should help the developer do the fast thing.
String concatenation is a little trickier. In a JIT'ed language you have a lot of options for making a hierarchy of string implementations that optimize different usage patterns, and still be fast - and what you really want for concatenation is a RopeString, like JS VMs have, that simply references the other strings. The issue is that you don't want virtual calls for hot-path string method calls.
Java chose a single final class so all calls are direct. But they should have been able to have a very small sealed class hierarchy where most methods are final and directly callable, and the virtual methods for accessing storage are devirtualized in optimized methods that only ever see one or two classes through a call site.
To me, that's a small complexity cost to make common string patterns fast, instead of requiring StringBuilder.

by cmovq

9 subcomments

When you're using a programming language that naturally steers you to write slow code you can't only blame the programmer.
I was listening to someone say they write fast code in Java by avoiding allocations with a PoolAllocator that would "cache" small objects with poolAllocator.alloc(), poolAllocator.release(). So just manual memory management with extra steps. At that point why not use a better language for the task?

by EricRiese

0 subcomment

This is a Spring specific gripe and I know this blog post doesn't assume Spring, but I hate seeing `new ObjectMapper()`. Spring Boot auto configures an ObjectMapper for you and you probably want the customization it gives you, including `java.time` handling and classpath scanning. I've wrestled with so many bugs caused by not using the `ObjectMapper` bean.

by cogman10

2 subcomments

Nitpick just because.
Orders by hour could be made faster. The issue with it is it's using a map when an array works both faster and just fine.
On top of that, the map boxes the "hour" which is undesirable.
This is how I'd write it
```
    long[] ordersByHour = new long[24];
    var defaultTimezone = ZoneId.systemDefault();
    for (Order order : orders) {
        int hour = order.timestamp().atZone(defaultTimezone).getHour();
        ordersByHour[hour]++;
    }
```
If you know the bound of an array, it's not large, and you are directly indexing in it, you really can't do any better performance wise.
It's also not less readable, just less familiar as Java devs don't tend to use arrays that much.

by kyrra

11 subcomments

First request latency also can really suck in Java before hotpathed code gets through the C2 compiler. You can warm up hotpaths by running that code during startup, but it's really annoying having to do that. Using C++, Go, or Rust gets you around that problem without having to jump through the hoops of code path warmup.
I wish Java had a proper compiler.

by kpw94

2 subcomments

The autoboxing example, IMO, is a case of "Java isn't so fast". Why can't this be optimized behind the scenes by the compiler?
Rest of advice is great: things compilers can't really catch but a good code reviewer should point out.
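The article's autoboxing example is not quoted in this thread, so the following is only an assumed shape of the pattern being discussed: a boxed `Long` accumulator next to its primitive counterpart. The class and method names are hypothetical.

```java
final class BoxingDemo {
    // Each "sum += i" unboxes the Long, adds, and boxes the result again.
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Same arithmetic on a primitive: no wrapper objects involved.
    static long sumPrimitive(int n) {
        long sum = 0L;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000) == sumPrimitive(1000));
    }
}
```

The two methods differ by a single capital letter, which is part of why a reviewer can miss it. Whether the JIT removes the boxing in a given case depends on the JVM and the surrounding code, so measuring is safer than assuming.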

by wood_spirit

2 subcomments

A subject close to my heart, I write a lot of heavily optimised code including a lot of hot data pipelines in Java.
And aside from algorithms, it usually comes down to avoiding memory allocations.
I have my go-to zero-alloc grpc and parquet and json and time libs etc and they make everything fast.
It’s mostly how idiomatic Java uses objects for everything that makes it slow overall.
But eventually after making a JVM app that keeps data in something like data frames etc and feels a long way from J2EE beans you can finally bump up against the limits that only c/c++/rust/etc can get you past.

by Okx

2 subcomments

The code:
```
    public int parseOrDefault(String value, int defaultValue) {
        if (value == null || value.isBlank()) return defaultValue;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (i == 0 && c == '-') continue;
            if (!Character.isDigit(c)) return defaultValue;
        }
        return Integer.parseInt(value);
    }
```
Is probably worse than Integer.parseInt alone, since it can still throw NumberFormatExceptions for values that overflow (which is no longer handled!). Would maybe fix that. Unfortunately this is a major flaw in the Java standard library; parsing numbers shouldn't throw expensive exceptions.

by seu

1 subcomments

I'm a bit surprised to see those examples, because there's nothing really new here. These are typical beginner pitfalls and have been around for a decade or more. Or maybe it's because I learned Java in the late 90s and later used it for J2ME, where using things like StringBuilder (StringBuffer in the old days) was almost mandatory, and you would be very careful to avoid unnecessary object allocations.

by titzer

1 subcomments

For fillInStackTrace, another trick is to define your own Exception subclass and override the method to be empty. I learned this trick 15+ years ago.
It doesn't excuse the "use exceptions for control flow" anti-pattern, but it is a quick patch.
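A minimal sketch of that trick, with a hypothetical class name. `Throwable.fillInStackTrace()` is a real public method, and overriding it to return `this` skips the stack walk that normally happens when the exception is constructed.

```java
final class StacklessException extends RuntimeException {
    StacklessException(String message) {
        super(message);
    }

    // Skip capturing the stack trace; construction becomes much cheaper,
    // at the price of having no trace to debug with.
    @Override
    public synchronized Throwable fillInStackTrace() {
        return this;
    }

    public static void main(String[] args) {
        System.out.println(new StacklessException("cheap").getStackTrace().length);
        System.out.println(new RuntimeException("normal").getStackTrace().length > 0);
    }
}
```

Since Java 7 there is also a protected `RuntimeException(String, Throwable, boolean, boolean)` constructor whose last argument, `writableStackTrace`, achieves a similar effect when set to false.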

by sgbeal

0 subcomment

Slight correction:
> StringBuilder works off a single mutable character buffer. One allocation.
It's one allocation to instantiate the builder and _any_ number of allocations after that (noting that it's optimized to reduce allocations, so it's not allocating on every append() unless they're huge).
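A small sketch that makes this visible, using changes in `StringBuilder.capacity()` as a proxy for buffer reallocations. The class and method names are hypothetical, and the exact number of growths is a JDK implementation detail, so the code reports it without assuming a value.

```java
final class BuilderCapacityDemo {
    // Counts how often the backing buffer grows while appending one char at a time.
    static int countGrowths(int appends) {
        StringBuilder sb = new StringBuilder(); // default initial capacity
        int growths = 0;
        int capacity = sb.capacity();
        for (int i = 0; i < appends; i++) {
            sb.append('x');
            if (sb.capacity() != capacity) {
                growths++;
                capacity = sb.capacity();
            }
        }
        return growths;
    }

    // Pre-sizing: when an upper bound on the length is known, the buffer never has to grow.
    static String join(String[] parts) {
        int expected = 0;
        for (String part : parts) expected += part.length() + 1;
        StringBuilder sb = new StringBuilder(expected);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("growths for 1000 appends: " + countGrowths(1000));
        System.out.println(join(new String[] {"a", "bb", "ccc"}));
    }
}
```

The growth count stays far below the number of appends because the buffer grows geometrically, which matches the point that it is not allocating on every append().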

by zahlman

0 subcomment

> Accidental O(n²) with Streams Inside Loops
Man that code looks awful. Really reminds me of why I drifted away from Java over time. Not just the algorithm, of course; the repetitiveness, the hoops that you have to jump through in order to do pretty "stream processing"... and then it's not even an FP algorithm in the end, either way!
Honestly the only time I can imagine the "process the whole [collection] per iteration" thing coming up is where either you really do need to compare (or at least really are intentionally comparing) each element to each other element, or else this exact problem of building a histogram. And for the latter I honestly haven't seen people fully fall into this trap very often. More commonly people will try to iterate over the possible buckets (here, hour values), sometimes with a first pass to figure out what those might be. That's still extra work, but at least it's O(kn) instead of O(n^2).
You can do this sort of thing in an elegant, "functional" looking way if you sort the data first and then group it by the same key. That first pass is O(n lg n) if you use a classical sort; making a histogram like this in the first place is basically equivalent to radix sort, but it's nice to not have to write that yourself. I just want to show off what it can look like e.g. in Python:
```
  def local_hour(order):
      return datetime.datetime.fromtimestamp(order.timestamp).hour

  groups = itertools.groupby(sorted(orders, key=local_hour), key=local_hour)
  orders_by_hour = {hour: len(list(orders)) for (hour, orders) in groups}
```
Anyway, overall I feel like these kinds of things are mostly done by people who don't need to have the problem explained, who have simply been lazy or careless and simply need to be made to look in the right place to see the problem. Cf. Dan Luu's anecdotes https://danluu.com/algorithms-interviews/ , and I can't seem to find it right now but the story about saving a company millions of dollars finding Java code that was IIRC resizing an array one element at a time.
(Another edit: originally I missed that the code was only trying to count the number of orders in each hour, rather than collecting them. I fixed the code above, but the discussion makes less sense for the simplified problem. In Python we can do this with `collections.Counter`, but it wouldn't be unreasonable to tally things up in a pre-allocated `counts_by_hour = [0] * 24` either.)
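For comparison, a single-pass tally in Java can also be written in a functional-looking style with `Collectors.groupingBy` and `Collectors.counting()`. This is a sketch: the `Order` record and its `timestamp()` accessor are assumptions modelled on the snippets in this thread, not the article's actual types.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class OrdersByHour {
    // Hypothetical order type; only the timestamp matters here.
    record Order(Instant timestamp) {}

    // One pass over the orders: each order is assigned to its hour bucket and counted.
    static Map<Integer, Long> countByHour(List<Order> orders, ZoneId zone) {
        return orders.stream().collect(Collectors.groupingBy(
                order -> order.timestamp().atZone(zone).getHour(),
                Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order(Instant.parse("2024-01-01T10:15:00Z")),
                new Order(Instant.parse("2024-01-01T10:45:00Z")),
                new Order(Instant.parse("2024-01-01T23:00:00Z")));
        System.out.println(countByHour(orders, ZoneOffset.UTC));
    }
}
```

This is O(n) like the array version, though it still boxes the hour keys and the counts, so a plain `long[24]` remains the cheaper option on a hot path.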
----
Edit:
> String.format() came in last in every category. It has to... StringBuilder was consistently the fastest. The fix: [code not using StringBuilder]... Use String.format() for the numeric formatting where you need it, and let the compiler optimize the rest. Or just use a StringBuilder if you need full control.
Yeah, this is confused in a way that I find fairly typical of LLM output. The attitude towards `String.format` is just plain inconsistent. And there's no acknowledgment of how multiple `+`s in a line get optimized behind the scenes. And the "fix" still uses `String.format` to format the floating-point value, and there's no investigation of what that does to performance or whether it can be avoided.

by hiyer

1 subcomments

I ran into 5 and 7 in a Flink app recently: it was parsing a timestamp as a number first and then falling back to an ISO 8601 string, which is what it actually was. The flamegraph showed 10% for the exception handling bit. While fixing that, I also found repeated creation of DateTimeFormatter. Neither was in a loop, but both were being done for every event, for 10s of 1000s of events every second.
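A sketch of both fixes under stated assumptions: the formatter lives in a `static final` field so it is built once, and the input is inspected before parsing so the numeric path is only attempted when it cannot fail. The class name, the pattern, and the choice of epoch milliseconds for the numeric form are all hypothetical; the Flink job's real formats are not given in the comment.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

final class EventTimestamps {
    // DateTimeFormatter is immutable and thread-safe, so build it once and reuse it.
    private static final DateTimeFormatter ISO_SECONDS =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssXXX");

    // Chooses the parser by looking at the input instead of catching an exception.
    static Instant parse(String raw) {
        if (isShortAsciiDigits(raw)) {
            return Instant.ofEpochMilli(Long.parseLong(raw));
        }
        return OffsetDateTime.parse(raw, ISO_SECONDS).toInstant();
    }

    // Up to 18 decimal digits always fit in a long, so parseLong cannot overflow here.
    private static boolean isShortAsciiDigits(String s) {
        if (s.isEmpty() || s.length() > 18) return false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(parse("1704164645000"));
        System.out.println(parse("2024-01-02T03:04:05Z"));
    }
}
```

Input that matches neither form still raises a DateTimeParseException, which seems reasonable for genuinely malformed events; the point is only that the common case no longer pays for an exception.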

by zvqcMMV6Zcr

2 subcomments

> Exceptions for Control Flow
This one is so prevalent that the JVM has an optimization where it gives up on filling in the stack trace for an exception if it is thrown over and over from the exact same place.

by Izkata

1 subcomments

"Java is slow" is a reputation it earned in the 90s/2000s because the JVM startup (at least on Windows) was extremely slow, like several seconds, with a Java-branded splash screen during that time. Even non-technical people made the association.

by jerf

0 subcomment

Any non-trivial program that has never had an optimizer run on it has a minimal-effort 50+% speedup in it.

by uraura

0 subcomment

I thought those were common sense until I worked on a program written by my colleague recently.

by comrade1234

0 subcomment

Also, finding the right garbage collector and settings that work best for your project can help a lot.

by taspeotis

1 subcomments

Knock Knock
Who’s there?
long pause
Java

by jandrewrogers

2 subcomments

You can write many of the bad examples in the article in any language. It is just far more common to see them in Java code than some other languages.
Java is only fast-ish even on its best day. The more typical performance is much worse because the culture around the language usually doesn't consider performance or efficiency to be a priority. Historically it was even a bit hostile to it.

by ww520

0 subcomment

The autoboxing in a loop case can be handled by the compiler.

by bearjaws

4 subcomments

JavaScript can be fast too, it's just the ecosystem and decisions devs make that slow it down.
Same for Java, I have yet to in my entire career see enterprise Java be performant and not memory intensive.
At the end of the day, if you care about performance at the app layer, you will use a language better suited to that.

by latchkey

0 subcomment

When they say that AI will replace programmers, I think of this article and come to terms with my own job security.
Most of this stuff is just central knowledge of the language that you pick up over time. Certainly, AI can also pick this stuff up instantly, but will it always pick the most efficient path when generating code for you?
Probably not, until we get benchmarks into the hot path of our test suite. That is something someone should work on.

by victor106

0 subcomment

this is great, so practical!!!
any other resources like this?

by spwa4

0 subcomment

Java IS fast. The time between deciding to use Java and Oracle's lawyers breaking down your door is measured in just weeks these days.

by tripple6

1 subcomments

Do good, don't do bad. Okay.

by koakuma-chan

7 subcomments

As much as I love Java, everybody should just be using Rust. That way you are actually in control, know what's going on, etc. Another reason specifically against Java is that the tooling, both Maven and Gradle, still sucks.