Sunday, February 18, 2018

A Fenwick bound week

Codeforces was quite active this week with two Division 1 rounds. The 462nd one took place on Wednesday (problems, results, top 5 on the left, analysis). Um_nik was the only one able to solve all problems, submitting one in the last minute — but he would've won without that one anyway. Congratulations on the great performance!

Round 463 took place on Thursday (problems, results, top 5 on the left, analysis). It was Radewoosh's turn to be the only competitor to solve all problems, with 7 minutes to spare this time. Well done!

He will be representing his university in the upcoming ICPC World Finals in Beijing, along with a few others that you can see in the screenshots on the left, so the competition there promises to be very interesting :)

Finally, the Open Cup Grand Prix of Gomel on Sunday provided another glimpse into full team performances (results, top 5 on the left). Moscow SU team Red Panda has found the winning ways again, this time thanks to being the only team to solve problem I — and doing it in the beginning of the third hour of the contest, which is quite unusual for a problem with only one solver. Congratulations!

Last week, I have mentioned a data structure problem from the Open Cup. n people are coming to lunch and forming a queue. The people are split into disjoint groups. When i-th person comes to the lunch, if there's nobody from their group ci already in the queue, they stand at the back of the queue. When there is somebody from their group already in the queue, they can also stand next to any such person (directly before or after), but only if they would be cutting in front of at most ai people by doing so. If they have multiple positions where they can join the queue according to the above restrictions, they always pick the front-most one. Initially the queue is empty. Given the values of ci and ai in the order the people come to lunch, how will the final queue look like?

One could immediately start thinking about standard approaches applicable in this problem. The fact that we need to be able to run a query against the last ai people in the queue suggests that we should probably use a balanced tree that supports quick splitting, like a treap, and maintain the size of the subtree plus some additional structures necessary to answer queries in each node. This approach can probably be made to work, but the nice thing about the problem is that we can obtain a much easier to code solution.

Since every person only ever joins the queue either in last position, or next to a person from their group, if we maintain the queue as a list of segments of people from the same group, in other words as a list of (group, counter) pairs, then the only operations we need to support is to increment a counter and to add a new group to the end of the list. Now in order to find a place for the next person we need to find the set of groups corresponding to the last ai+1 people, and then find the first group of the required type in that set.

In order to find the boundary on the group level, we need to be able to find the largest prefix for which the sum of counters does not exceed a given value. This can be done in O(logn) by maintaining the counters in a Fenwick tree, I will give more details below.

And in order to find the first group of the required kind after the given one, we can simply maintain a vector of indices of groups for each kind. Since we only ever create new groups at the end, these indices never change, and a simple binary search can find the first group of the given kind after a given index.

Finally, after we know what happens with groups we know at which position does each person enter the queue, but we still need to output the final order. This can be done using the same Fenwick tree with "largest prefix with sum not exceeding x" operation we already used: we go from the end and maintain the free spots in the Fenwick tree.

So instead of a balanced tree with extra information in nodes, we've managed to get by using two very easy to code data structures: a Fenwick tree, and a vector. The solution is O(nlogn), and the constant hidden in O() is also really small.

This solution used the following non-standard operation on a Fenwick tree: find the largest prefix with sum not exceeding x. A naive implementation using binary search and Fenwick prefix sums would give O(log2n) complexity, which would most likely still be fast enough to get the problem accepted. However, tgehr has pointed out to me that the standard Fenwick tree allows an extremely simple O(logn) way to answer this question.

Suppose our Fenwick tree has n elements, and k is the largest power of 2 not exceeding n. The (2k-1)-th element of the Fenwick tree array contains the sum s of elements on the segment [0;2k-1], so by comparing x with just this number we know if our answer is below or above 2k. Let's suppose it's above 2k. Then we notice that (2k+2k-1-1)-th element of the Fenwick tree array contains the sum of elements on the segment [2k;2k+2k-1-1], so by comparing x-s with just this number we know if our answer is below or above 2k+2k-1. We continue traversing the powers of two in the same manner, and it turns out that the standard Fenwick tree stores exactly the sums needed to execute this binary search! Here's the actual code:

private int upperBound(int[] f, int x) {
    int res = 0;
    int max = Integer.numberOfTrailingZeros(
        Integer.highestOneBit(f.length));
    for (int k = max; k >= 0; --k) {
        int p = res + (1 << k) - 1;
        if (p < f.length && f[p] <= x) {
            x -= f[p];
            res += 1 << k;
        }
    }
    return res;
}

Thanks for reading, and check back next week.

Monday, February 12, 2018

A leafy week

This week featured two weekend contests. First, TopCoder SRM 729 took place on Saturday (problems, results, top 5 on the left, my screencast with commentary). If you sum up the problem columns in the screenshot on the left, you can notice that the sum doesn't match the score column, and that's because the match presented ample opportunities for challenging. You can see on the screencast as I try to prepare an uber-challenge-case for the 450 during the intermission, and spent the beginning of the challenge phase getting it to work, while many solutions were already being challenged. uwi has made the best use of the challenge opportunities and thus claimed the first place. Well done!

Most of the challenge opportunities presented themselves in the medium problem, which looked very standard at first glance. You are given a 1000x1000 grid. In one jump, you can move from a cell to any other cell that's at least the given distance d away — you can't jump very close. What is the smallest number of jumps required to get from one given cell to another given cell?

We could use a standard breadth-first search to solve this problem, but we have 1 million cells and potentially 1 million jumps from each cell, so the total running time would be on the order of 1012 which is too slow. Can you see at least one way to speed this solution up? (there are many!)

The other weekend contest was the Open Cup 2017-18 Grand Prix of Khamovniki (results, top 5 on the left). Unlike the previous round, the active ICPC teams from Russia were not at the top of the standings, with only two ICPC teams from Asia and a veteran team Past Glory able to solve 10 problems — congratulations, and especially to Seoul National U 2 team for another Open Cup victory!

Problem D "Lunch Queue" was from the rare species of data structure problems that I enjoyed solving. n people are coming to lunch and forming a queue. The people are split into disjoint groups. When i-th person comes to the lunch, if there's nobody from their group ci already in the queue, they stand at the back of the queue. When there is somebody from their group already in the queue, they can also stand next to any such person (directly before or after), but only if they would be cutting in front of at most ai people by doing so. If they have multiple positions where they can join the queue according to the above restrictions, they always pick the front-most one. Initially the queue is empty. Given the values of ci and ai in the order the people come to lunch, how will the final queue look like?

In my previous summary, I have mentioned a hard AtCoder problem: you are given a rooted tree with n vertices, and each vertex contains an integer between 1 and n, each number appearing exactly once. Your goal is to rearrange the numbers in such a way that vertex 1 has number 1, vertex 2 has number 2, and so on. You're allowed to do the following transformation: take the path connecting the root of the tree with any vertex v, and cyclically rotate it, placing the number that was in root into v, the number that was in v into the parent of v, and so on — essentially a generalization of insertion sort from a single path to an arbitrary tree. You need to sort the tree in at most 25000 operations for n=2000.

I did not solve the problem myself, so I will describe the solution from the official analysis. First, we can notice that given any leaf, we can put the correct value into it in <=n operations (first pull it up to the root, then put it into the leaf in one rotation, and then never touch this leaf again, so we could sort everything in n2 operations, which is too much.

However, we can notice that in this approach we have quite a lot of freedom for choosing the moves that pull the given number up. We could try to use this freedom to try to pull up the numbers for all leaves, not just for one leaf. But even that would not be enough, as when our rooted tree is a chain it only has one leaf. However, we know that for a chain a simple solution is possible, called the normal insertion sort: we'll do exactly n operations, and after k operations we'd have k bottom numbers of the chain already sorted in correct order, in each turn inserting the new number to the appropriate place.

Now we need to combine the ideas of pulling the required number from anywhere in the tree with the idea of filling a chain in one O(n) operation stage in such a way that the number of stages would be O(log(n)) for any rooted tree. More precisely, we will find all leaf chains in the tree — chains that hang at the bottom with nothing else attached to them, and fill them all with correct values in one stage. This guarantees O(log(n)) stages since the number of leaves is divided by at least two after each stage.

Whenever the root of the tree has a useful value — one that must be in one of the leaf chains — we will send it there, inserting it into the correct place relative to other values that we've sent to the same leaf chain, just like the insertion sort approach above. And when the root has a useless value, we need to send it somewhere, so let's send it to any position which contains a useful number, but such that all its subtree contains either only useless numbers, or useful numbers that are already placed into the corresponding leaf chain (in other words, the numbers that we don't need to get to the root anymore). This will push this useful number towards the root, and create a subtree that doesn't have any numbers that we want to get to the root.

Why does this cycle finish eventually, and more importantly why does it run in O(n) operations? Each useful value passes through the root only once. A useless value, after passing through the root, ends up being in a dormant subtree, and would never pass through the root again, unless we need to touch this subtree because we're putting a useful value into it. In each such operation, at most one value moves from a dormant subtree back into circulation, and thus can reach the root again. So the total number of times a useless value that we've already seen once returns to the root does not exceed the total number of times we put a useful value into its place, meaning that the total number of operations for one stage is at most n plus total size of leaf paths, so O(n).

Thanks for reading, and check back next week!

Saturday, February 10, 2018

A Red Panda week

Last week started with Codeforces Round 459 on Monday (problems, results, top 5 on the left, analysis). Once again the hardest problem was too much to handle, and OO0OOO00O0OOO0O00OOO0OO was quite a lot faster on the first four. Congratulations on the win!

On Saturday, AtCoder hosted an unusually long 5-hour contest called AtCoder Petrozavodsk Contest 001 (problems, results, top 5 on the left, my screencast, analysis). There was a huge difficulty gap between the first six problems, all of which were solved by 52 contestants, and the last four. Only two contestants have managed to solve two of the more difficult kind. Congratulations to Marcin and vepifanov on this amazing feat!

The most approachable problem of the difficult kind, problem H, went like this: you are given a rooted tree with n vertices, and each vertex contains an integer between 1 and n, each number appearing exactly once. Your goal is to rearrange the numbers in such a way that vertex 1 has number 1, vertex 2 has number 2, and so on. You're allowed to do the following transformation: take the path connecting the root of the tree with any vertex v, and cyclically rotate it, placing the number that was in root into v, the number that was in v into the parent of v, and so on — essentially a generalization of insertion sort from a single path to an arbitrary tree. You need to sort the tree in at most 25000 operations for n=2000.

Finally, the Open Cup season came back from the winter break on Sunday with the Grand Prix of Korea (results, top 5 on the left). Moscow State University team Red Panda have won the contest by a whole problem ahead of other super strong university and veteran teams. Congratulations on the victory!

Since the problems for this contest came from Korea, I can't help but wonder if top Korean/Asian ICPC teams have also solved this problemset elsewhere, or will do that later. Does anybody have some scoreboards to share?

Last week I have mentioned a 250-point TopCoder problem: you are given 50 integers, each up to a billion. In one step, you can take any integer that is greater than 1 and divide it by 2. If it was odd, you can choose whether to round up or down. What is the smallest number of steps required to make all integers equal?

My thinking went like this: instead of deciding immediately whether we'd round up or down, let's keep a set of possible values for each integer, which will always be a segment [min;max]. After dividing some numbers by 2 a few times, we'll have such segment for each number. Whenever these segments all have a non-empty intersection, we're done.

For example, if our initial numbers are 3, 5, and 13, then we start with segments [3;3], [5;5] and [13;13]. After dividing the last number by 2 we have [3;3], [5;5] and [6;7]. After doing it again we have [3;3], [5;5] and [3;4]. After dividing the second number by 2 we have [3;3], [2;3] and [3;4]. Now all our segments intersect in point 3, so we can make all numbers equal to 3.

The only remaining thing is to decide which numbers to divide by 2, but it actually flows naturally from the above observation. What does it mean when all segments do not have an intersection? It means that there are two segments [a;b] and [c;d] that do not intersect (this is not true for arbitrary sets, but it is true for segments), without loss of generality we have b<c. But then we must divide the segment [c;d] by 2, otherwise it will never intersect with [a;b] or its descendants. So as long as we don't have an intersection, we can always find a segment to divide by 2 this way.

Thanks for reading, and check back in a couple of days for this week's summary!

Sunday, January 28, 2018

A TopCoder week

TopCoder SRM 728 was this week's contest (problems, results, top 5 on the left, analysis, my screencast). The hard problem was a relatively straightforward application of the Burnside's lemma, which I described in this blog more than 9 (!) years ago. The easy problem allowed a ton of completely different correct approaches, and thus in a sense was more interesting: you are given 50 integers, each up to a billion. In one step, you can take any integer that is greater than 1 and divide it by 2. If it was odd, you can choose whether to round up or down. What is the smallest number of steps required to make all integers equal?

I've also just realized that I forgot to cover the previous SRM, TopCoder SRM 727 which took place two weeks ago very early in the morning (problems, results, top 5 on the left, analysis). lych_cys, participating in Division 1 for the first time, has squeezed the victory during the challenge phase from wxh010910, who was doing only his third Division 1 round himself, while the rest of the pack was more than 100 points behind. Congratulations to the winning newcomers!

Thanks for reading, and check back next week!

Monday, January 22, 2018

A mean median week

This week Codeforces hosted its Round 458, also known as CodeCraft-18 (problems, results, top 5 on the left, analysis). Syloviaely has made a solid claim to the first place by submitting the 3000-point problem F just 13 minutes after the contest started (was it a previously known problem, or was it really fast solving?..), but SkyDec has gathered a whopping 800 points from challenges on two different problems, 450 on B and 350 on C, and thus finished first with quite a margin. Congratulations!

This round also had a great side story — the top-rated contestant, Um_nik, decided to go for the hardest problem H, and that bet did not play out well as he spent too much time on it and as I understood also discovered an incorrect output in one of the samples (which means an incorrect reference solution?..). The admins have decided to make the round unrated for him because of the incorrect sample issue, but he asked them to reverse that decision (and thus make him lose rating) because he deemed the issue did not impact him significantly. Kudos to Um_nik for the admirable honesty!

Last week, I have mentioned an AtCoder problem: you are given n<=2000 positive integers, each up to a<=2000. Consider all 2n-1 nonempty subsets, and compute the sum of the numbers in each subset. Now sort those sums, and find the median sum — the 2n-1-th in the sorted order. You need to output this median sum. Time limit is 2 seconds.

At first sight, the problem appears somewhat standard: we can run the knapsack dynamic programming that computes the number of ways to get each possible sum, and then go from lower sums to higher sums until we get 2n-1 in total.

This has two issues, though: first, the maximum sum is n*a, so the dynamic programming runs in O(n2*a). Second, the number ways to get each sum can be on the order of 2n, so we need big integers, and the running time after accounting for that is more like O(n3*a). Even O(n2*a) is too slow.

Here comes the insight: notice that all sums can be split in pairs. If the sum of all numbers is s, then for each subset with sum x we'll have the complementary subset with sum s-x (let's also include the empty subset). One of those sums is <=s/2, and the other is >=s/2, so the median is roughly equal to s/2. More precisely, the median will always be the smallest sum that is >=s/2.

This allows us to get rid of the big integers, as now we merely need to know which sums are possible, and don't care about the number of ways anymore. Moreover, we can use bitmasks to speed up this O(n2*a) solution ~32-64 times which makes it fast enough.

Thanks for reading, and check back next week!

Monday, January 15, 2018

An Um_nik week

This week's contests shared one of the problemsetters — and the winner. On Monday, Codeforces hosted its Hello 2018 round (problems, results, top 5 on the left, analysis). With the hardest problem having too much piecewise polynomial integration, the competition focused on the remaining seven. Only Um_nik could get them all, and even with 24 minutes to spare — congratulations!

On Sunday, AtCoder Grand Contest 020 started the race to Japan (problems, results, top 5 on the left, my screencast, analysis). Once again the hardest problem by tourist was a tiny bit too hard (and I even solved it after the contest by integrating polynomials, probably inspired by the analysis of the previous contest), and once again Um_nik was way faster than everybody else on the remaining problems — big congratulations again!

I found problem C quite cute. You are given n<=2000 positive integers, each up to 2000. Consider all 2n-1 nonempty subsets, and compute the sum of the numbers in each subset. Now sort those sums, and find the median sum — the 2n-1-th in the sorted order. You need to output this median sum. Time limit is 2 seconds.

In my last post I've discussed what was arguably the hardest contest problem of 2017: you are given an undirected graph with at most 200000 vertices and edges. Your goal is to split its vertices into four disjoint groups A, B, C, D in such a way that:
  • The subgraph formed by groups A+B is connected and has at least 2 vertices.
  • The subgraph formed by groups C+D is connected and has at least 2 vertices.
  • Each vertex in A has an edge to each vertex in C.
  • There are no other edges between subgraphs A+B and C+D.
If there are many possible solutions, you need to output any one of them.

Thanks to great ideas and pointers by Swistakk, mnbvmar and lewin, we were able to put together a O(n*logn) solution — you can check it out in this comment thread.

Thanks for reading, and check back next week!

Monday, January 8, 2018

An Otherland week

The first week of 2018 was quite calm, with the most devoted taking stabs at the still-ongoing Prime New Year contest by snarknews. Only one problem remains unsolved there, and it looks like it might take some team effort to crack it. I think it's within the spirit of the contest to discuss its problems publicly (after all, many even have analysis posted online), so I propose to do just that. Please skip the following paragraphs if you don't want spoilers!

The problem goes like this: you are given an undirected graph with at most 200000 vertices and edges. Your goal is to split its vertices into four disjoint groups A, B, C, D in such a way that:
  • The subgraph formed by groups A+B is connected and has at least 2 vertices.
  • The subgraph formed by groups C+D is connected and has at least 2 vertices.
  • Each vertex in A has an edge to each vertex in C.
  • There are no other edges between subgraphs A+B and C+D.
If there are many possible solutions, you need to output any one of them.

I have been thinking about the problem for some time, but only have the following not very helpful observations:
  • If the graph has an articulation point, then we can take this point as set A, any of the components it separates as set B, and the rest as C+D. So we can assume the graph is vertex-biconnected.
  • The answer is not always "Yes" - the smallest counterexample is the graph with 4 vertices and 5 edges. This example also shows that if a biconnected graph has a vertex of degree 2, then this vertex and both adjacent vertices must be in the same half of our splitting, since both ends of a splitting edge must have degree of at least 3 (two splitting edges plus one inside the half).
  • There are also counterexamples without vertices of degree 2.
  • There might be some hashing trick involved. I.e., for example suppose we assign a random number ai to each vertex, and compute for each vertex the value sum(ai-aj), where the summation goes over all adjacent vertices j. Then if we add up those values within one half of the graph (A+B in the above terminology), then almost everything will cancel out, leaving us |C|*aA-|A|*aC, where aA is the sum of ain A. But where to go from here?..
Please share your ideas!

In my last post, I have held a vote for the best problem of 2017. And the winner is this AtCoder problem by maroonrk: you are given two rooted trees on the same set of 105 vertices. You need to assign integer weights to all vertices in such a way that for each subtree of each of the trees the sum of weights is either 1 or -1, or report that it's impossible. The solution is explained in this post. Congratulations to the problemsetter and to AtCoder, and thanks to all problemsetters of 2017!

Thanks for reading, and check back next week.