15
\$\begingroup\$

When code golfing, it can be useful to write numbers succinctly. This challenge involves using a source language to generate code for a target language. To be valid, a submission must generate 1,000,001 snippets of target-language code which evaluate to the integers 0, 1, ..., 1000000 (when evaluated in the target language). Target languages unable to express these constants exactly are excluded from this challenge.

This is meta-golf: your score is the sum of the number of bytes in each snippet. In particular, the length of the source code provided does not affect your score, so feel free to format it as you prefer. Similarly, the way you delimit the code snippets is flexible. You can return an array, print one on each line, etc., as long as it's clear which is which.

Because each target language has different rules, submissions only compete with other submissions using the same target language.

Results

Note that different target languages are not competing with one another; I've arranged them by size only for interest.

| Target language | Bytes | Contributor |
|---|---|---|
| (best possible non-byte-based language) | 2,491,449 | N/A |
| (best possible byte-based language) | 2,933,952 | N/A |
| !@#$%^&*()_+ | 3,932,313 | Fmbalbuena |
| Vyxal 3 | 4,791,348 | pacman256 |
| Charcoal | 4,982,032 | Neil |
| JavaScript | 5,887,526 | Arnauld |
| C++ | 5,887,608 | Toby Speight |
| (various - baseline) | 5,888,897 | Charles |
| Brainfuck | 500,000,500,000 | Charles |

Note: No byte-based target language can have a score better than 2,933,952. A target language constructed information-theoretically perfectly for this challenge would use log(1000001)/log(2) bits for each number for a total of just under 2,491,449 bytes.
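
Both bounds in the note above can be reproduced with a few lines of Python. This is only a sketch of the counting argument, assuming that the byte-based bound assigns the shortest available byte strings (including the empty one) to the 1,000,001 values; the function names are illustrative and not part of the challenge.

```python
import math

N = 1_000_001  # number of snippets: 0 through 1,000,000


def byte_based_bound(count: int) -> int:
    """Total bytes if every snippet is a distinct byte string, shortest first."""
    total, length, remaining = 0, 0, count
    while remaining > 0:
        available = 256 ** length        # distinct byte strings of this length
        used = min(available, remaining)
        total += used * length
        remaining -= used
        length += 1
    return total


def information_bound(count: int) -> float:
    """Total bytes if each snippet costs exactly log2(count) bits."""
    return count * math.log2(count) / 8


print(byte_based_bound(N))    # 2933952
print(information_bound(N))   # a little below 2491449
```

The first figure comes from one empty snippet, 256 one-byte snippets, 65,536 two-byte snippets and 934,208 three-byte snippets.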

Note: If you don't need a source language, just submit your answer as, e.g., "Text → C" and note the delimiter used between programs. But you're guaranteed to need megabytes for this.

\$\endgroup\$
8
  • 1
    \$\begingroup\$ Does type matter? For example 1e3 will evaluate to 1000.00000 in many languages (and there may be a tiny error in the 10th or 20th significant figure.) Do such floating point numbers count as integers? \$\endgroup\$ Commented Sep 12 at 18:45
  • \$\begingroup\$ @LevelRiverSt Good question! I was imagining keeping it to exact integers, so 1e3 is out but floor(1e3) is ok. 1e5\1 is still a win in PARI/GP. \$\endgroup\$ Commented Sep 12 at 21:09
  • \$\begingroup\$ Python => !@#$%^&*()_+, byte amount unknown. Probably the most optimal unless there's another language that prints the characters as numbers. \$\endgroup\$ Commented Sep 13 at 0:42
  • 1
    \$\begingroup\$ @LevelRiverSt: In C and C++, 1e3 is exactly 1000.0, no rounding error allowed. It's a compile-time constant, and if the exact value is representable, must have that value, else rounded up or down to nearest representable value in an implementation-defined manner. (Code golf answers get to pick any real implementation). It's not a call to an imprecise pow(10, 3) function. An implicit conversion of 1e3 back to int will produce 1000, so if the target language is C, you should be good to use scientific notation in numeric literals. \$\endgroup\$ Commented Sep 13 at 21:16
  • 1
    \$\begingroup\$ Are binary languages interesting for this? Most of my code-golf answers are in x86 machine code. With that as the target language, an array of 1M+1 int32_t values takes 4 bytes each, so 4000004 bytes. Or with packed _BitInt(24) still aligned at byte boundaries, 3000003. (log2(1M) = 19.93 rounds up to 20 bits per number fitting, and as a bonus that's at least nibble-aligned for 2500002.5 bytes) Maybe a varint binary representation, but probably not since most of the numbers are large. I notice you phrased it as 1M+1 separate snippets, so a delta encoding isn't valid. \$\endgroup\$ Commented Sep 13 at 21:30

9 Answers

5
\$\begingroup\$

Vyxal 3 → Vyxal 3, 4791344 bytes (previously 4868748)

"①②③④⑤⑥⑦"f
"k0k1k2k3k4k①k②k③k④k⑤k⑥k⑧k⑦"²
"17E18E19E"³
Wfµᴥ]:"›‹⨥⨪d½T":2ℭ$fJX“J
⑦ k1Rƛ#c'²+]
⑤ʀ¨'³+
"kn⍢⌊""k5⍢⌊""ko⌊ᐕ"
9ƛ'!+]
"5 7*5 8*6 7*8 6*9 6*7:*"4Ϣ
Wf
⑴ᴥ☷ƛµL⎋]
ᏐUVk⑧Rƛ#c"dT"f+]ʎᴥk⑧>]fJ
ʎL5<]ʎᴥk4≤]ʎᴥ⎂L<]
ʎᴥ#c⎂L<]
Þ£
k4ƛÞw:ᴥn=:a[▲h|k1<[n|n#c]

Vyxal It Online!

Replace the last line with :ᴥ#c⎂Ŀ-∑ to see how many bytes this saves compared to just compressing the numbers, since it takes too long to actually print them all. This currently saves 77466 bytes over raw compression.

Uses the built-in compression for most numbers, except when a number is n+1, n+2, n-1, n-2, n^2, n^3, 2n, 3n, 2^n, or n/2, where n is any built-in constant or any number less than 65536. It also throws in 67890, the factorials 5!, 6!, 7!, 8! and 9!, and cubes. This usually gives a 4-byte answer that modifies a compressed two-byte number, compared to the 5 bytes needed to compress any number larger than 2^16. Anything longer than 4 bytes is the same as or worse than base 252 compression. I spent about 2 hours on this and had to run it all on my laptop, because the server crashed once and took forever to run. The program outputs an array of strings; the contents of each string can be pasted directly into the Vyxal interpreter and, when run, will return the desired number.

Edit: I forgot to include some values in the range in line 5; it's about double the savings lol

old explanation:

"①②③④⑤⑥⑦"f
"k0k1k2k3k①k②k③k④k⑤k⑥k⑧k⑦"²
"17E18E19E"³
Wfµᴥ]:"›‹⨥⨪dT½":2ℭ$fJX“J⑴ᴥ☷ƛµL⎋]⑴ᴥu
ᏐUVk⑧Rƛ#c"dT"f+]ʎᴥk⑧>]fJ
⑦ k1Rƛ#c'²+]Ju
ʎL5<]ʎᴥ#c⎂L<]ʎᴥk4<]Þ£
k4ƛÞw:ᴥn=:a[▲h|k1<[n|n#c]

"①②③④⑤⑥⑦"f                           # one byte built ins
"k0k1k2k3k①k②k③k④k⑤k⑥k⑧k⑦"²          # two byte constants
"17E18E19E"³                         # 2 powers for 17,18,19
Wfµᴥ]:                               # all of those sorted by their value and copied
      "›‹⨥⨪dT½":                     # +1, -1, +2, -2, double, triple, halve and copy
                2ℭ$fJ                # get all combinations of length 2 and 1
                     X“J             # cartesian product with the constants and joined to the list of constants unmodified.
# We now have a list of built in constants with one byte allowed for modification as well, however there are duplicates as n+1-1 is the same as n
                        ⑴ᴥ☷ƛµL⎋]⑴ᴥu  # get the shortest combination for each number generated by at least one combo
ᏐUVk⑧Rƛ#c      ]                     # all 3 byte compressed numbers between 65536/3 and 65536
         "dT"f+]                     # doubled and tripled
                ʎᴥk⑧>]fJ             # keep only the ones greater than 65536 and join those to the list of constants
⑦ k1Rƛ#c'²+]Ju                       # all squares less than one million
ʎL5<]ʎᴥ#c⎂L<]ʎᴥk4<]Þ£                # get rid of ones that are greater than 5 bytes long, save no bytes over raw compression, or are greater than one million and push the remaining ones to the register (now a stack!)
k4ƛÞw:ᴥn=:a[▲h|k1<[n|n#c]            # if the number is in the list of generated constants, use that, otherwise if it is less than 1000 just print it, otherwise print it as a compressed number

Created with the help of Luminespire.

\$\endgroup\$
3
  • \$\begingroup\$ I might add cubes later and save some more bytes but it would only be at most 100 or so \$\endgroup\$ Commented Sep 13 at 23:07
  • \$\begingroup\$ other improvements besides cubes include 3 power, 5 power, factorials, nth primes, and combinatorics numbers maybe? \$\endgroup\$ Commented Sep 14 at 5:40
  • \$\begingroup\$ I think this is it except for individual powers not included like 7:* for 7^7 \$\endgroup\$ Commented 2 days ago
5
\$\begingroup\$

JavaScript → JavaScript, 5,887,526 bytes

Method

We use the following patterns:

| Pattern description | Example |
|---|---|
| exponentiation | 3**9 for 19683 |
| left-shift | 7<<14 for 114688 |
| scientific notation | 123e3 for 123000 |
| scientific notation with single-digit offset | 2e5-1 for 199999 |
| scientific notation with division by 8 | 9e5/8 for 112500 (*) |

(*) This is actually the one and only case where this pattern saves a byte.

Source

const expr = [];

for(let n = 0; n <= 1e6; n++) {
  expr[n] = n.toString();
}

for(let p = 2; p < 100; p++) {
  for(let q = 2; q < 20; q++) {
    tryExpression(`${p}**${q}`);
    tryExpression(`${p}<<${q}`);
  }
}

for(let p = 1; p < 1000; p++) {
  for(let q = 3; q < 10; q++) {
    tryExpression(`${p}e${q}`);
    tryExpression(`${p}e${q}/8`);

    for(let o = 1; o < 10; o++) {
      tryExpression(`${p}e${q}-${o}`);
      tryExpression(`${p}e${q}+${o}`);
    }
  }
}

console.log(`Score: ${expr.reduce((t, e) => t + e.length, 0)} bytes\n`);

expr.forEach((e, n) =>
  e != n.toString() && console.log(n.toString().padEnd(7) + ': ' + e)
);

function tryExpression(e) {
  const v = eval(e);
  if(v >= 0 && v <= 1e6 && v % 1 == 0 && e.length < expr[v].length) {
    expr[v] = e;
  }
}

Try it online!

(literal numbers in their standard decimal form are omitted in the output)

\$\endgroup\$
4
\$\begingroup\$

Python 3 → Charcoal, 4982032 bytes

Charcoal doesn't have a compressed number syntax, so mostly savings are achieved by taking the ordinal of a character: ℅d...℅~ (2 bytes), ℅✑...℅✲ (4 bytes) and ℅𘚢...℅󴈾 (5 bytes) all save a byte over a numeric literal. This saves over 900000 bytes.

Otherwise, the only option is to get creative with the predefined variables χ (c) = 10 and φ (f) = 1000, and the operators ⊕ Incremented, ⊖ Decremented, ⊗ Doubled, ⊘ Halved, ⁺ Plus, × Times and X Power. There are only a limited number of these, so they have been hard-coded in the program. Where two numeric literals are exponentiated, this unfortunately requires an extra ¦ as a separator, reducing the number of opportunities where this can be used.

Although not required here, for numbers over 1000000, except when there are special cases, taking ordinals is the most efficient up to 1114111, after which you need to switch to base 95 conversion. Note that if the base 95 encoding would start with - you need to prefix a space to get it to decode correctly, so for instance 11145874 would encode as ⍘,~~~γ but 11145875 would encode as ⍘ -   γ (a space, then - followed by three spaces), although that still saves a byte.
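
The two base 95 examples can be checked with a short Python sketch. It assumes the digit alphabet is printable ASCII in order, from space (value 0) to ~ (value 94), which is what the examples above suggest; the helper name to_base95 is made up for illustration and is not Charcoal's API.

```python
# Printable ASCII from space (digit 0) to '~' (digit 94): 95 digits in total.
DIGITS = "".join(chr(c) for c in range(32, 127))


def to_base95(n: int) -> str:
    """Return the base 95 digits of n, most significant first."""
    if n == 0:
        return DIGITS[0]
    out = []
    while n:
        n, r = divmod(n, 95)
        out.append(DIGITS[r])
    return "".join(reversed(out))


print(repr(to_base95(11145874)))  # ',~~~'
print(repr(to_base95(11145875)))  # '-   '
```

Since 11145875 is exactly 13 × 95³, its encoding is the digit for 13 (-) followed by three zero digits (spaces), which is why the leading - case appears at that value.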

There are also some extra special cases to consider for large values because string compression can sometimes result in short compressed strings if there are patterns in the number e.g. 1111111111 compresses to I”|↓⦄” or 1100110011 compresses to I”)∧⧴;”. (For random numbers string compression won't win until the numbers are at least 50 digits long; I haven't measured this value exactly.)

Note that although I have used numeric literals when there isn't a saving to be made, it's possible that you need two numeric literals in a row, so an alternative same-length representation for one of the literals could avoid a ¦ separator thus saving a byte. I haven't tried to show these alternatives here.

The program starts by printing the total length of all the literals (in Charcoal's code page). It does not show all 1000001 literals as this would not fit in TIO's output buffer, instead it collapses similar ranges e.g. 99900 is found in the 99001 ... 99998 ⁹⁹⁰⁰¹ ... ⁹⁹⁹⁹⁸ range indicating that it would be represented as ⁹⁹⁹⁰⁰.

d = "⁰¹²³⁴⁵⁶⁷⁸⁹"
a = [''.join(d[int(c)]for c in str(i))for i in range(1000001)]
for i in range(100, 127):
 a[i] = "℅" + chr(i)
for i in range(10000, 16512):
 a[i] = "℅" + chr(i)
for i in range(100000, 999999):
 a[i] = "℅" + chr(i)
for c in ";∧∨“⊞⊟➙⧴″‴&|↶↷⟲←↑→↓⎇‽↧↥⌊⌈↖↗↘↙⭆?⪫⪪℅◧◨⮌≡№⊙⸿⬤≔≕▷▶✂↨⍘✳↔≦≧ⅈⅉ⌕⊕⊖⊗⊘⎚₂﹪⁺⁻⁰⁴⁵⁶⁷⁸⁹‖‹⁼›ABCDEFGHIJKLMNOPQRSTUVWXYZ⟦∕⟧…⦃⦄~�":
 a[ord(c)] = "℅´" + c
 if len(a[ord(c) - 1]) > 4:
  a[ord(c) - 1] = "⊖℅´" + c
 if len(a[ord(c) + 1]) > 4:
  a[ord(c) + 1] = "⊕℅´" + c
a[10] = "χ"
a[500] = "⊘φ"
a[999] = "⊖φ"
a[1000] = "φ"
a[1001] = "⊕φ"
for i in range(2, 11):
 a[1000 + i] = "⁺φ" + a[i]
 a[1000 * i] = "×φ" + a[i]
a[1024] = "X²χ"
a[1998] = "⊗⊖φ"
a[1999] = "⊖⊗φ"
a[2000] = "⊗φ"
a[2001] = "⊕⊗φ"
a[2002] = "⊗⊕φ"
a[16807] = "X⁷¦⁵"
for i in range(17, 127):
 a[i * 1000] = "×φ" + a[i]
a[19683] = "X³¦⁹"
a[32768] = "X⁸¦⁵"
a[46656] = "X⁶¦⁶"
a[59048] = "⊖X³χ"
a[59049] = "X³χ"
a[59050] = "⊕X³χ"
a[65536] = "X⁴¦⁸"
a[78125] = "X⁵¦⁷"
a[99999] = "⊖Xχ⁵"
a[100000] = "Xχ⁵"
a[100001] = "⊕Xχ⁵"
a[117649] = "X⁷¦⁶"
a[200000] = "⊗Xχ⁵"
a[262144] = "X⁴¦⁹"
a[279936] = "X⁶¦⁷"
a[390625] = "X⁵¦⁸"
a[500000] = "×φ⊘φ"
a[531441] = "X⁹¦⁶"
a[823543] = "X⁷¦⁷"
a[999000] = "×φ⊖φ"
a[999999] = "⊖×φφ"
a[1000000] = "×φφ"
print(sum(4+(s[1]>"䁿")if s[0] == "℅" and s[1] > "´" else len(s)for s in a))
b = 0
c = "⁰"
for i in range(1000001):
 if a[i][:2] == c:
  continue
 if len(a[i]) == 2 and a[i][0] == "℅" and c == "℅":
  continue
 if all(c in d for c in a[i]) and c == "⁰":
  continue
 if c:
  if b < i - 1:
   print(b, "...", i - 1, a[b], "...", a[i - 1])
  else:
   print(b, a[b])
 b = i
 if a[i][:2] == "℅´" or a[i][:2] == "⁺φ":
  c = a[i][:2]
 elif len(a[i]) == 2 and a[i][0] == "℅":
  c = "℅"
 elif all(c in d for c in a[i]):
  c = "⁰"
 else:
  c = ""
  print(i, a[i])

Try it online!

\$\endgroup\$
1
  • \$\begingroup\$ Wow! Almost makes me wish I went a little higher, glad you added the note about base 95 for larger numbers. \$\endgroup\$ Commented yesterday
2
\$\begingroup\$

This is a trivial example which generates, e.g., "101" for 101. This is valid in many languages besides C, of course. For most languages, this is the baseline which you aim to improve.

Python → C (5,888,897 bytes)

for i in range(1000001):
    print(i)
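
The baseline score can be confirmed by summing the decimal lengths directly. A minimal sketch, assuming one byte per decimal digit:

```python
# Sum of the lengths of the decimal representations of 0..1,000,000.
baseline = sum(len(str(i)) for i in range(1_000_001))

# The same total, grouped by digit count: 10 one-digit numbers (including 0),
# 90 two-digit numbers, and so on, plus the single seven-digit number 1000000.
by_digits = 10 * 1 + sum(9 * 10 ** (d - 1) * d for d in range(2, 7)) + 7

print(baseline)   # 5888897
print(by_digits)  # 5888897
```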
\$\endgroup\$
2
\$\begingroup\$

Python → Brainfuck (500,000,500,000 bytes)

for i in range(1000001):
    print('+' * i)

This assumes an idealized version of Brainfuck which allows constants greater than 255 in a cell. (Without this, Brainfuck is excluded as unable to store large enough numbers, though you could consider a version storing the numbers in multiple cells.)
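
The score follows from the triangular-number formula, since the snippet for i is i bytes long. A quick check in Python:

```python
n = 1_000_000

# Closed form for 0 + 1 + ... + n.
closed_form = n * (n + 1) // 2

# Direct summation of the snippet lengths, for comparison.
direct = sum(range(n + 1))

print(closed_form)  # 500000500000
```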

This can be greatly improved and should not be remotely competitive.

\$\endgroup\$
2
\$\begingroup\$

05AB1E → 05AB1E, 3,934,943 bytes

6°Ýε
 "T₂₆₃т₅₁₄"©.V)ćkDdi®sèDˆ
 ëyg3‹iyDˆ
 ëy356‹iyт>-₅B"Ƶÿ"Dˆ
 ë"žAžBžCžDžEžFžGžH"©.V)ykDi®2ôsèDˆ
 ëy₄α3‹i"Í< >Ì"yÌè"₄"ìDˆ
 ëyg4‹iyëy₅B"•ÿ"}
  Dg¯yCè©g>›yTмõQ*i®'b«}
  Dg¯ytïè©g>›ytDïQ*i®'n«}
  Dg¯yT.nïè©g>›yT.nDïQ*i®"°"«}
  Dg¯y.Øè©g>›yp*i®"Ø"«}
  Dg¯y;è©g>›yÈ*i®"·"«}
  Dg¯yRè©g>›y›*yθĀ*i®'R«}
  Dg¯y2äнè©g>›yÂQ*i®"ûº"yÈè«}
  Dˆ

(Don't) try it online (times out after 60 seconds outputting roughly 4,500 values).

Explanation:

6°Ýε       # Push a list in the range [0,1000000] and map over it:
"T₂₆₃т₅₁₄"©.V)ćkDdi®sèDˆ
           #  If there is a single-byte constant for the value: use it
           #  ([T,₂,₆,₃,т,₅,₁,₄] = [10,26,36,95,100,255,256,1000])
ëyg3‹iyDˆ  #  Else-if the current value is just 1 or 2 digits: use it as is
ëy356‹iyт>-₅B"Ƶÿ"Dˆ
           #  Else-if the value is in the range [101,355]: use 2-bytes compression
ë"žAžBžCžDžEžFžGžH"©.V)ykDi®2ôsèDˆ
           #  Else-if there is a two-bytes constant for the value: use it
           #  ([žA,žB,žC,žD,žE,žF,žG,žH] = [512,1024,2048,4096,8192,16384,32768,65536])
ëy₄α3‹i"Í< >Ì"yÌè"₄"ìDˆ
           #  Else-if the value is in the range [998,1002]: use ₄ + de-/incrementer
ë          #  Else:
 yg4‹iy    #   If the value has 3 digits: use it as is
 ëy₅B"•ÿ"} #   Else: compress it as 3 or 4 bytes
 Dg¯yCè©g>›yTмõQ*i®'b«}
          '#   If the value is binary, and can be shortened with an earlier value + "b": use that instead
 Dg¯ytïè©g>›ytDïQ*i®'n«}
          '#  If the value is a square number, and can be shortened with an earlier value + "n": use that instead
 Dg¯y2.nïè©g>›y2.nDïQ*i®'o«}
          '#  If the value is a power of 2, and can be shortened with an earlier value + "o": use that instead
 Dg¯yT.nïè©g>›yT.nDïQ*i®"°"«}
           #  If the value is a power of 10, and can be shortened with an earlier value + "°": use that instead
 Dg¯y.Øè©g>›yp*i®"Ø"«}
           #  If the value is a prime, and can be shortened with an earlier value + "Ø": use that instead
 Dg¯y;è©g>›yÈ*i®"·"«}
           #  If the value is even, and can be shortened with an earlier value + "·": use that instead
 Dg¯yRè©g>›y›*yθĀ*i®'R«}
          '#  If the value does not end with a 0, is smaller reversed, and can be shortened with an earlier value + "R": use that instead
 Dg¯y2äнè©g>›yÂQ*i®"ûº"yÈè«}
           #  If the value is a palindrome, and can be shortened with an earlier value + "û" (odd length) or "º" (even length): use that instead
           #  Else: use the 3/4-bytes compression
 Dˆ        #  Add a copy to the global array,
           #  which we'll potentially use for the later checks if something can be shorter

Which results in:

  • 1 byte each for (18 in total):
    • Single digit numbers in the range [0,9]
    • [10,26,36,95,100,255,256,1000] with 1-byte constants [T,₂,₆,₃,т,₅,₁,₄]
  • 2 bytes each for (291 in total):
    • 2-digit numbers in the range [10,99], except for the earlier mentioned 1-byte constants
    • [512,1024,2048,4096,8192,16384,32768,65536] with 2-bytes constants [žA,žB,žC,žD,žE,žF,žG,žH]
    • 2-bytes compressed integers in the range [101,355], except for the earlier mentioned 1-byte constants
    • 998,999,1001,1002 using 1-byte constant 1000 (₄), decremented once (<) for 999 or twice (Í) for 998, or incremented once (>) for 1001 or twice (Ì) for 1002
    • 1-byte digits/constants in combination with 1-byte builtins (larger than 355, and minus earlier found values):
      • ₄Í; ₄<; ₄>; ₄Ì for 998,999,1001,1002 respectively
      • Tb; ₂b; ₆b for 1010,11010,100100 respectively
      • ₂n; ₆n; ₃n; тn; ₅n; ₄n for 676,1296,9025,10000,65025,1000000 respectively
      • ; for 10000,100000 respectively
      • ₃Ø; тØ; ₅Ø; ₁Ø; ₄Ø for 503,547,1619,1621,7927 respectively
      • ₅·; ₄· for 510,2000 respectively
      • ₅R; ₁R for 552,652 respectively
      • ₃û; тû; ₅û; ₁û for 959,10001,25552,25652 respectively
      • ₆º; ₃º; тº; ₅º; ₁º; ₄º for 2662,3663,9559,100001,255552,256652 respectively
  • 3 bytes each for (64425 in total):
    • 3-digit numbers in the range [100,999], except for the earlier mentioned 1- and 2-bytes numbers
    • 3-bytes compressed integers in the range [1000,65024], except for the earlier mentioned 1- and 2-bytes numbers
    • earlier 2-bytes numbers in combination with 1-byte builtins (larger than 65024, and minus earlier found values):
  • 4 bytes each for (935267 in total):
    • 4-bytes compressed integers in the range [65025,1000000], except for the earlier mentioned 1-, 2-, and 3-bytes numbers

Resulting in a total of \$18 + 291\times2 + 64425\times3 + 935267\times4 = 3934943\$ bytes.
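
As a sanity check on the tally above, the per-length counts should cover all 1,000,001 values and give the stated total. A small sketch using the counts quoted in the list:

```python
# Snippet length in bytes -> number of values using that length,
# taken from the breakdown above.
counts = {1: 18, 2: 291, 3: 64425, 4: 935267}

covered = sum(counts.values())
total = sum(size * count for size, count in counts.items())

print(covered)  # 1000001
print(total)    # 3934943
```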

\$\endgroup\$
3
  • \$\begingroup\$ so the byte savings over vyxal come from the compressed 3 char strings being treated as numbers? thus saving a byte on over eight hundred thousand numbers? \$\endgroup\$ Commented yesterday
  • 1
    \$\begingroup\$ @pacman256 I guess yeah. 05AB1E can compress numbers in the range \$[101,355]\$ in 2 bytes each using Ƶ., and in the range \$[356,65024]\$ in 3 bytes each using Ž.. (although I haven't used this second one, since the regular compression for \$\geq65025\$ with •...• won't need the trailing • if we're only outputting the number, so Ž.. and •.. are the same length). 1M would be •F"‘ compressed, so the maximum is 4 bytes in the range \$[0,1000000]\$ for this challenge. \$\endgroup\$ Commented yesterday
  • 1
    \$\begingroup\$ okay yeah vyxal does not have any four byte compressed number since you can't drop the trailing one and the compression automatically uses two byte number symbol to compress ≥65535, it jumps right from 3 to five \$\endgroup\$ Commented 16 hours ago
2
\$\begingroup\$

C++ → C++, 5,887,607 bytes (previously 5,887,608)

We use scientific notation for multiples of powers of ten¹ (and for small offsets from those) and shift-left operator for multiples of powers of two.

#include <algorithm>
#include <concepts>
#include <bit>
#include <format>
#include <string>
#include <vector>

template<std::unsigned_integral T>
std::string shortest_repr(T i)
{
    std::vector<std::string> candidates{std::to_string(i)};
    if (i) {
        // can we use scientific notation?
        for (T e = 3u, m = 1000u;  i % m == 0;  ++e, m *= 10) {
            candidates.push_back(std::format("{}e{}", i/m, e));
        }
        // scientific notation with a small offset
        for (T e = 5u, m = 100'000u, a = 10u;  m < i+a;  ++e, m *= 10, a *= 10) {
            if (i % m < a) {
                candidates.push_back(std::format("{}e{}+{}", i/m, e, i % m));
            } else if (m - i%m < a) {
                candidates.push_back(std::format("{}e{}-{}", (i+a)/m, e, m - i%m));
            }
        }
        // ⅛ of large number
        for (T m = 8u;  m * i % 100'000 == 0;  m *= 2) {
            candidates.push_back(std::format("{}/{}", shortest_repr(m*i), m));
        }
        // bit shift
        if (T z = std::countr_zero(i);  z > 10) {
            candidates.push_back(std::format("{}<<{}", shortest_repr(i >> z), z));
        }
    }
    return *std::ranges::min_element(candidates, {}, &std::string::size);
}

With this function, we can create a view by transforming suitable input. So to write one item per line to standard output, we have

#include <iostream>
#include <iterator>
#include <ranges>

int main()
{
    auto const strings
        = std::views::iota(0u, 1'000'001u)
        | std::views::transform(shortest_repr<unsigned>);

    std::ranges::copy(strings, std::ostream_iterator<std::string>(std::cout, "\n"));
}

Similarly, to count the output size:

    std::locale::global(std::locale(""));
    std::println("Total: {:L}", std::ranges::fold_left(strings
                                    | std::views::transform(&std::string::size),
                                0uz, std::plus<>{}));

If we wish to do both in the same program, I recommend materialising into a collection. This one prints the values that are shorter than just writing the literal number, as well as the total size of all the strings:

#include <cmath>
#include <iostream>
#include <iterator>
#include <print>
#include <ranges>

int main()
{
    auto const strings
        = std::views::iota(0u, 1'000'001u)
        | std::views::transform(shortest_repr<unsigned>)
        | std::ranges::to<std::vector>();

    std::ranges::copy(strings | std::views::enumerate
                      | std::views::filter([](auto e){ auto& [n,s] = e; return n && s.size() <= std::log10(n);})
                      | std::views::values,
                      std::ostream_iterator<std::string>(std::cout, "\n"));

    std::locale::global(std::locale(""));
    std::println("Total: {:L}", std::ranges::fold_left(strings | std::views::transform(&std::string::size), 0uz, std::plus<>{}));
}

Selected output:

⋮
1e5+7
1e5+8
1e5+9
101e3
102e3
103e3
104e3
105e3
106e3
107e3
108e3
109e3
11e4
111e3
112e3
9e5/8
113e3
114e3
7<<14
115e3
116e3
⋮
Total: 5,887,607

¹ Although scientific notation produces a floating-point type, this will be exact for the small numbers used in this question and will therefore result in exactly-correct values when parsed in integer context.
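
The exactness claim in the footnote can be illustrated in Python, whose float is an IEEE 754 double on common platforms. This sketch assumes the C++ implementation also uses IEEE 754 doubles, and it only covers mantissas below 1000 with exponents 3 to 6; the function name is illustrative.

```python
def sci_literals_are_exact(limit: int = 10**6) -> bool:
    """Check that literals of the form '<m>e<k>' parse to exact integers."""
    for mantissa in range(1, 1000):
        for exponent in range(3, 7):
            exact = mantissa * 10 ** exponent
            if exact > limit:
                continue
            value = float(f"{mantissa}e{exponent}")
            if value != exact or not value.is_integer():
                return False
    return True


print(sci_literals_are_exact())     # True
print(float("9e5") / 8 == 112500)   # True
```

Every integer of this size is far below 2^53, so it has an exact double representation.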

\$\endgroup\$
3
  • \$\begingroup\$ Thanks @heapunderrun - I think I pasted in an early version and didn't correct it when I fixed to use string rather than char. \$\endgroup\$ Commented yesterday
  • \$\begingroup\$ Yes, the latter - the final program is correct here. \$\endgroup\$ Commented 23 hours ago
  • \$\begingroup\$ Should all be correct now @heapunderrun - if so, we can delete these comments. \$\endgroup\$ Commented 21 hours ago
1
\$\begingroup\$

Python 3 → !@#$%^&*()_+, 3932313 bytes

for i in range(1000001):
	# print(str(i) + ":", end = "") # This is used for indicating which number to print for the code snippet
	if chr(i) in "@$%&*)_+":
		print(chr(i//2)+chr((i//2) + (i%2)) + "+", end = "")
	elif chr(i) in "!#(?^":
		print(chr(i-1)+"^", end = "")
	else:
		print(chr(i), end = "")
	# print("#") # to print the number, uncomment this line.

Try it online!

There are 128 1-byte characters, 1920 2-byte characters, 63488 3-byte characters, and at least one million 4-byte characters. There are 8 special numbers that take two more bytes to print, and 5 special numbers that take one more byte to print. So by considering these things, if I'm right, the byte count is:

$$8×3+5×2+(128−8−5)×1+1920×2+63488×3+(1000001−128−1920−63488)×4$$

Which is 3932313, and is pretty cool but also a bit boring.
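
The formula can be checked by counting bytes per code point. This is a sketch that assumes UTF-8 byte lengths by code point range (with surrogate code points counted as 3 bytes, as the formula does) and uses the two special-character sets from the generator above; the names are illustrative.

```python
PLUS_SPECIALS = "@$%&*)_+"  # emitted as two characters plus '+': 3 bytes
XOR_SPECIALS = "!#(?^"      # emitted as one character plus '^': 2 bytes


def utf8_len(code_point: int) -> int:
    """Number of bytes UTF-8 uses for a code point in this range."""
    if code_point < 0x80:
        return 1
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3
    return 4


total = 0
for i in range(1_000_001):
    c = chr(i)
    if c in PLUS_SPECIALS:
        total += 3
    elif c in XOR_SPECIALS:
        total += 2
    else:
        total += utf8_len(i)

print(total)  # 3932313
```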

\$\endgroup\$
3
  • \$\begingroup\$ You don't need the #, you just need to create the number in whatever manner is appropriate to your language (putting it on the stack in your case). I don't get your total, though. I get 8×3 + 5×2 + (128-8-5)×1 + 1920×2 + 63488×3 + 934465×4 = 3,932,313. \$\endgroup\$ Commented yesterday
  • 1
    \$\begingroup\$ @Charles I think I fixed the total. \$\endgroup\$ Commented yesterday
  • \$\begingroup\$ It's possible there's a better encoding than UTF-8 for this purpose, but I don't know of one. It's certainly a good score even as is. \$\endgroup\$ Commented yesterday
1
\$\begingroup\$

Excel VBA → Excel, 5888673 bytes (previously 5888896)

OP clarified that the snippets need to resolve to the correct value without relying on the generating code placing them in the correct place. With that in mind, I have an even less clever solution that nonetheless uses fewer bytes in the output.

Sub MillionAndOneConstants()
    
    ' Generate all the values from 0 to 1,000,000
    Range("A1").Formula2 = "=SEQUENCE(1000000,,0)"
    Application.Calculate
    If Not Application.CalculationState = xlDone Then
        DoEvents
    End If
    Range("A:A").Copy
    Range("A:A").PasteSpecial xlPasteValues
    
    ' Replace values with an equivalent exponent form if that is less bytes
    Range("A1025") = "4^5"
    Range("A1297") = "6^4"
    Range("A2188") = "3^7"
    Range("A2402") = "7^4"
    Range("A3126") = "5^5"
    Range("A4097") = "4^6"
    Range("A6562") = "3^8"
    Range("A7777") = "6^5"
    Range("A10001") = "10^4"
    Range("A10649") = "22^3"
    Range("A12168") = "23^3"
    Range("A13825") = "24^3"
    Range("A14642") = "11^4"
    Range("A15626") = "5^6"
    Range("A16385") = "4^7"
    Range("A16808") = "7^5"
    Range("A17577") = "26^3"
    Range("A19684") = "3^9"
    Range("A20737") = "12^4"
    Range("A21953") = "28^3"
    Range("A24390") = "29^3"
    Range("A27001") = "30^3"
    Range("A28562") = "13^4"
    Range("A29792") = "31^3"
    Range("A32769") = "8^5"
    Range("A35938") = "33^3"
    Range("A38417") = "14^4"
    Range("A39305") = "34^3"
    Range("A42876") = "35^3"
    Range("A46657") = "6^6"
    Range("A50626") = "15^4"
    Range("A50654") = "37^3"
    Range("A54873") = "38^3"
    Range("A59050") = "9^5"
    Range("A59320") = "39^3"
    Range("A64001") = "40^3"
    Range("A65537") = "4^8"
    Range("A68922") = "41^3"
    Range("A74089") = "42^3"
    Range("A78126") = "5^7"
    Range("A79508") = "43^3"
    Range("A83522") = "17^4"
    Range("A85185") = "44^3"
    Range("A91126") = "45^3"
    Range("A97337") = "46^3"
    Range("A100001") = "10^5"
    Range("A100490") = "317^2"
    Range("A101125") = "318^2"
    Range("A101762") = "319^2"
    Range("A102401") = "320^2"
    Range("A103042") = "321^2"
    Range("A103685") = "322^2"
    Range("A103824") = "47^3"
    Range("A104330") = "323^2"
    Range("A104977") = "18^4"
    Range("A110593") = "48^3"
    Range("A117650") = "7^6"
    Range("A125001") = "50^3"
    Range("A130322") = "19^4"
    Range("A131073") = "2^17"
    Range("A132652") = "51^3"
    Range("A140609") = "52^3"
    Range("A148878") = "53^3"
    Range("A157465") = "54^3"
    Range("A160001") = "20^4"
    Range("A161052") = "11^5"
    Range("A166376") = "55^3"
    Range("A175617") = "56^3"
    Range("A177148") = "3^11"
    Range("A185194") = "57^3"
    Range("A194482") = "21^4"
    Range("A195113") = "58^3"
    Range("A205380") = "59^3"
    Range("A216001") = "60^3"
    Range("A226982") = "61^3"
    Range("A234257") = "22^4"
    Range("A238329") = "62^3"
    Range("A248833") = "12^5"
    Range("A250048") = "63^3"
    Range("A262145") = "4^9"
    Range("A274626") = "65^3"
    Range("A279842") = "23^4"
    Range("A279937") = "6^7"
    Range("A287497") = "66^3"
    Range("A300764") = "67^3"
    Range("A314433") = "68^3"
    Range("A328510") = "69^3"
    Range("A331777") = "24^4"
    Range("A343001") = "70^3"
    Range("A357912") = "71^3"
    Range("A371294") = "13^5"
    Range("A373249") = "72^3"
    Range("A389018") = "73^3"
    Range("A390626") = "5^8"
    Range("A405225") = "74^3"
    Range("A421876") = "75^3"
    Range("A438977") = "76^3"
    Range("A456534") = "77^3"
    Range("A456977") = "26^4"
    Range("A474553") = "78^3"
    Range("A493040") = "79^3"
    Range("A512001") = "80^3"
    Range("A524289") = "2^19"
    Range("A531442") = "9^6"
    Range("A537825") = "14^5"
    Range("A551369") = "82^3"
    Range("A571788") = "83^3"
    Range("A592705") = "84^3"
    Range("A614126") = "85^3"
    Range("A614657") = "28^4"
    Range("A636057") = "86^3"
    Range("A658504") = "87^3"
    Range("A681473") = "88^3"
    Range("A704970") = "89^3"
    Range("A707282") = "29^4"
    Range("A729001") = "90^3"
    Range("A753572") = "91^3"
    Range("A759376") = "15^5"
    Range("A778689") = "92^3"
    Range("A804358") = "93^3"
    Range("A810001") = "30^4"
    Range("A823544") = "7^7"
    Range("A830585") = "94^3"
    Range("A857376") = "95^3"
    Range("A884737") = "96^3"
    Range("A912674") = "97^3"
    Range("A923522") = "31^4"
    Range("A941193") = "98^3"
    Range("A970300") = "99^3"
    Range("A1000001") = "10^6"
    
End Sub

It fills in regular integers for all the values and then replaces certain values with an exponential representation if that is shorter. This list of values could also be generated by the program but, in this case, I worked them out separately because that was easier than writing and debugging a VBA solution.

\$\endgroup\$
2
  • \$\begingroup\$ This is creative, but not quite what I intended. If you're writing an Excel formula and you want to use 1000000, you can't rely on being on the millionth row. But you can write 10^6 which is shorter. Since you're writing a snippet of a formula rather than a formula, you don't need the = in your count. So Excel can do a lot better than 5,888,896. \$\endgroup\$ Commented yesterday
  • 1
    \$\begingroup\$ @Charles The result is now 0.0038% shorter. \$\endgroup\$ Commented yesterday
