Talk:BF Joust

I have an idea for another variant, based on the User:ais523 version (my intention is that all the additional rules in this list would be used in the extra variant): --Zzo38 15:31, 24 May 2009 (UTC)
 * In case of a draw, figure out who would have lost using the old rule that whoever's flag is zero loses immediately (and determine the winner according to this rule, so whoever's flag was zero first loses).
 * The command , takes input from the opponent's output (as a queue) if any is available, otherwise it inputs a random byte to your program. The random bytes are kept in a list so that both programs will input the same list of random bytes whenever random bytes are input. (It also adds an additional source of randomness)
 * If both programs have ended and neither player's flag is zero, then both programs shall restart from the beginning with the tape pointer pointing to their own flag again like it was at the start, but using the current values on the tape instead of resetting them to zero.
 * Possibly a kind of betting, the players can bet after the program runs part way through, bet again later after the program runs more, and so on, with some amounts of information given to the players (some shared and some known only to individual players), until a winner is determined or one player folds. Possibly also as part of the betting process you also give some additional input to both programs (the additional input could be based on the amount you bet).
 * Cards that can be used by the players during betting in certain circumstances, with various effects. The number of cards is small enough that you might be able to guess which cards your opponent is holding.
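A minimal sketch of how the proposed , rule could behave, assuming each program keeps its own read position in one shared, lazily generated list of random bytes. The class names SharedRandomBytes and InputPort are made up for illustration and are not part of any existing interpreter.

```python
import random
from collections import deque


class SharedRandomBytes:
    """Lazily grown list of random bytes shared by both programs (hypothetical helper)."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._bytes = []

    def get(self, index):
        # Extend the list on demand so both programs see identical values.
        while len(self._bytes) <= index:
            self._bytes.append(self._rng.randrange(256))
        return self._bytes[index]


class InputPort:
    """Input side of one program: opponent output first, shared random bytes otherwise."""

    def __init__(self, shared):
        self.queue = deque()    # bytes the opponent has output with '.'
        self._shared = shared
        self._next_random = 0   # this program's own position in the shared list

    def read(self):
        if self.queue:
            return self.queue.popleft()
        value = self._shared.get(self._next_random)
        self._next_random += 1
        return value


shared = SharedRandomBytes(seed=1)
left_in, right_in = InputPort(shared), InputPort(shared)

right_in.queue.append(65)       # the left program executed '.' on a cell holding 65
first_right = right_in.read()   # taken from the opponent's output queue
r0 = right_in.read()            # queue is empty, so the first shared random byte
l0 = left_in.read()             # the left program gets the same first random byte
```

Keeping a separate read position per program is one way to read "both programs will input the same list of random bytes"; a single shared position would instead hand each random byte to only one of the two programs.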

The ", takes input from enemy's .s" thing is interesting, if weird; . is currently intended as a nop, and it seems fundamentally silly to implement a command to explicitly give your opponent information. :) I'm not yet certain which of these, or which combinations of them, would provide the most strategic depth, or even whether any of them would, but for more opcodes... Some combination of (while not 0, loop/set this value to the cell under my pointer:) (opponent's pointer position from (their perspective/your perspective)/value under opponent's pointer). Right now the only way to gain information is to use []s while on a cell that the opponent has touched, so implementing more opcodes would be a serious change. For fun, though, a hill could go through a rotation of implementing different opcodes, seeing which ones are the most interesting to play with and which ones break the game by making one strategy clearly dominate. Also, could add an if (condition != 0) foo; else bar; scheme as an equivalent to any loop type. Would need to experiment. --Patashu 15:13, 26 May 2009 (UTC)

Another one: Set my pointer (from my perspective) to the value at this cell. From enemy's perspective is no good since you could step onto a 0 and warp onto their flag. --Patashu 01:28, 27 May 2009 (UTC)

What if there was an additional cell past each flag? Maybe it shouldn't start as a 0 but a 1 to allow for more defensive opportunities or something; the important thing is that it provides some means of noticing you found your own flag other than "oops I fell off"

[ and ] should take the value on the tape *after* the opponent has (potentially) modified it. This would enable code such as "[]+", which currently will not function because of timing. I don't think adding more opcodes would make the game better, though the lack of comparison operators severely limits what is feasible. An "age" parameter on the hill would be nice, too. --myndzi 02:49, 27 May 2009 (UTC)
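To make the timing difference concrete, here is a small sketch of which value a loop test would compare against zero under each rule. It assumes cells wrap modulo 256 and that the opponent changes the cell by at most one per cycle; the function name loop_test_sees is hypothetical.

```python
def loop_test_sees(cell_before, opponent_delta, test_after_opponent):
    """Return the value a '[' or ']' test would compare against zero.

    cell_before: cell value at the start of the cycle (0..255, assumed range)
    opponent_delta: +1, -1 or 0, the opponent's change during this cycle
    test_after_opponent: False for the current timing, True for the proposal
    """
    if test_after_opponent:
        return (cell_before + opponent_delta) % 256
    return cell_before


# The opponent decrements a cell holding 1 in the same cycle as our loop test.
current = loop_test_sees(1, -1, test_after_opponent=False)
proposed = loop_test_sees(1, -1, test_after_opponent=True)
```

Under the proposed timing the test already sees the zeroed cell, which is the kind of same-cycle reaction the "[]+" example relies on.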

Scoring
This is a description of the new proposed scoring system for the egojoust hill for your consideration.

Abstract Description
100 points are distributed uniformly among all n warriors on the hill before the tournament. Each warrior bets 1/n of its points on each match. (It keeps the remaining 1/n for itself in order to make this process aperiodic. This is what would happen if it played one match against itself anyway.) Since each match lasts 42 rounds, each warrior wagers evenly on each round, that is, 1/(42n) of its score per round. The winner of each round takes the points wagered by the loser. No points are exchanged for draws.

The tournament is repeated with the new scores as input until the scores no longer change from one repetition to the next.
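The repeated wagering can be sketched directly as an iteration. This is an illustration under stated assumptions, not the hill's actual code: the win counts in wins are invented, and the names wager_step and fixed_point are hypothetical.

```python
ROUNDS = 42  # rounds per match, as stated above


def wager_step(scores, wins):
    """One pass of wagering; wins[i][j] is the number of rounds in which i beat j."""
    n = len(scores)
    new = list(scores)
    for i in range(n):
        for j in range(n):
            if i != j:
                stake = scores[j] / (ROUNDS * n)  # what j wagers on each round
                new[i] += wins[i][j] * stake      # winner i collects
                new[j] -= wins[i][j] * stake      # loser j pays
    return new


def fixed_point(wins, total=100.0, tol=1e-9, max_iter=100_000):
    """Repeat the wagering until the scores stop changing (within tol)."""
    n = len(wins)
    scores = [total / n] * n
    for _ in range(max_iter):
        nxt = wager_step(scores, wins)
        if max(abs(a - b) for a, b in zip(scores, nxt)) < tol:
            return nxt
        scores = nxt
    return scores


# Made-up results for three warriors; draws account for the missing rounds.
wins = [
    [0, 30, 42],
    [10, 0, 25],
    [0, 12, 0],
]
scores = fixed_point(wins)
```

Every payment is subtracted from one warrior and added to another, so the total stays at 100 throughout.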

The Algorithm
The above process is computed in a single step in the following way: Run the tournament once, tracking how many rounds each player won in each match. From this, construct the matrix P(i,j)={the number of times i beat j}/(42n) if i!=j, and P(i,i)=1-sum(j!=i|P(j,i)), so that each column sums to 1 and warrior i keeps whatever it does not pay out. This is the transition matrix that performs one step of the "losers paying winners a portion of their score" process described above.

Finally, take the eigenvector of this matrix corresponding to eigenvalue 1 (which should be the largest eigenvalue), and multiply it by 100/{sum of its entries} so that the final scores sum to 100.
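Written out as equations, this is one reading of the construction. It assumes scores are held in a column vector and that the diagonal is chosen so that each column of P sums to 1, i.e. each warrior keeps whatever it does not pay out. The symbols s (score vector), s* (its fixed point) and w_ij (number of rounds in which i beat j) are introduced here for illustration.

```latex
P_{ij} = \frac{w_{ij}}{42n} \quad (i \neq j), \qquad
P_{jj} = 1 - \sum_{i \neq j} P_{ij}, \qquad
s^{(t+1)} = P\, s^{(t)}, \qquad
P s^{*} = s^{*}, \quad \sum_{i} s^{*}_{i} = 100 .
```

Since each warrior can lose at most 42 rounds to each of its n-1 opponents, every diagonal entry is at least 1/n, which is the aperiodicity mentioned in the abstract description.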

Alternate Algorithm?
There has been some debate in #esoteric between myself and User:ehird as to which sort of fixed point scoring system to use. ehird believes that the transaction should occur at the level of a match. In other words, a program should pay out the entirety of its wager on the match if its opponent would beat it with probability > 1/2 on a random tape length and polarity. I believe it makes sense to conduct the wagers at the level of individual rounds/tilts. In other words, a program pays out a fraction of its wager on the match equal to the probability its opponent would beat it on a random tape length and polarity. Either system can be implemented as a fixed-point algorithm, so I'm posting this here for public comment.
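The difference between the two proposals comes down to how much of a match wager the loser hands over. Here is a sketch, assuming the 42 rounds of a match cover every tape length and polarity exactly once, so that the opponent's win probability is simply its round wins divided by 42. The function names are hypothetical.

```python
from fractions import Fraction

ROUNDS = 42  # rounds per match


def payout_match_level(opp_wins):
    """Match-level rule: pay the whole wager if the opponent wins more than half the rounds."""
    if Fraction(opp_wins, ROUNDS) > Fraction(1, 2):
        return Fraction(1)
    return Fraction(0)


def payout_round_level(opp_wins):
    """Round-level rule: pay a share of the wager equal to the opponent's win probability."""
    return Fraction(opp_wins, ROUNDS)


# Fraction of the match wager paid out for a few opponent win counts.
table = {
    w: (payout_match_level(w), payout_round_level(w))
    for w in (20, 21, 22, 42)
}
```

With these definitions an opponent winning 22 rounds collects the full wager under the match-level rule but only 11/21 of it under the round-level rule.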

Here is a bulleted argument why I think my system is better. I invite ehird or any who agrees with his position to argue for it, since I feel that I would poorly represent its advantages.

Under ehird's system:
 * Winning 22 rounds and tying the rest is just as good as winning 42 rounds.
 * A program that barely beats all other warriors is better than one that overwhelmingly crushes the top ten and draws with the remainder.
 * Marginal improvements of a program against certain strategies/warriors have no net benefit to its score unless they happen to push it over the hump from losing to winning.
 * In the case that a tiny change to a program does change it from losing to winning, it will cause a large change in its score. I expect this could lead to a lot of instability in rankings.
 * Since the margin of victory/loss is irrelevant to scoring, a program that defeats every other program on more than half of the lengths and polarities will automatically receive the highest possible score. Thus, no improvements to this program can result in a higher score. Likewise, a program that is defeated by every other program will receive the lowest possible score, which means that a program like "[>[+]+]" will pretty much always receive the same score (0.0) as "<" (if they are not on the hill together), even though the former is obviously better than the latter in the sense that it will sometimes actually win a round.

My system does not, I believe, have any of the above properties (which one may or may not consider flaws). However, it does have some strange properties that would need to be accepted:
 * A program near the bottom of the hill will simultaneously receive as much benefit from beating an opponent on the top of the hill in a handful of rounds as the opponent at the top of the hill receives from beating it on the remainder of the rounds. This is a necessary property of any fixed point scoring system which allows marginal improvements to affect score. (Under ehird's system, only the program at the top of the hill receives any benefit from the same match.)
 * Maximizing the number of + signs in your program's row on the report may not be the way to maximize your score, as whether or not you "beat" a particular program is not factored into your score in any way. (I consider this a benefit, but I can see that it is not obviously so. My reasoning is that beating a program by one round should not be significantly better than drawing with it; the numbers in the two cases are practically identical.)
 * Constant-tweaking may become a bigger factor in improving a program's score, since even changes that don't turn losses into victories could potentially change one's ranking. This may incentivize the running of parameter optimization algorithms on programs to improve their scores and somewhat trivialize the metagame. On the other hand, everyone could just agree this is a pretty lame thing to do in most cases and preferentially treat those who innovate new strategies. Or, everyone could do it and we may arrive at the ranking equilibrium where parameter tweaking is no longer beneficial.

To help you decide which system you prefer, here is how the current hill would be ranked under each system:

ehird's system: http://sprunge.us/IjhS

my system: http://sprunge.us/ONOi

a third system where you don't have to win >50% of rounds, just more rounds than opponent (handles draws differently): http://sprunge.us/YDWT

--Quintopia (talk) 01:25, 17 June 2012 (UTC)