Answer the question
In order to leave comments, you need to log in
[closed]Insert 1,000,000 unique random values
It is necessary to insert into the database 1,000,000 unique generated words 4 characters long. Russian alphabet. The implementation language is php. I did it with In_array, but the processing speed did not please me. Help with an idea please?
upd: solution found
Answer the question
In order to leave comments, you need to log in
solved the issue by creating an additional table, then the method with UNIQUE KEY is used. After the experiment, the additional table is completely cleared. Thanks to all:)
as an option, you throw all the words into an array, but as keys. Count the number of elements in the array - and add n missing words - check again. And so on, until the number of elements is = 1000000.
Then you throw all the keys into the database.
I don’t know how much faster it will be, but this is the first thing that came to mind)
You can simply generate 1,000,000 unique numbers (10th number system), and then translate the result into a quaternary number system, and replace {0,1,2,3} with your alphabet.
You can generate it using the base: something like
and write UNIQUE KEY `value` (`uniq_value`) for the table
while ($n<1000000)
{
if(mysql_query("INSERT INTO keys (value) VALUES ("
. rand_val() . ")")))
{
$n++;
}
}
If I understand everything correctly, then you can generate all possible values, write them to an array, do a shuffle and take only the first million.
<?
$words=array();
$w='aaaa';
for($i1=0;$i1<27;$i1++)
for($i2=0;$i2<27;$i2++)
for($i3=0;$i3<27;$i3++)
for($i3= 0;$i3<27;$i3++)
{
$w[0]=chr(97+$i1);
$w[1]=chr(97+$i2);
$w[2]=chr(97+$i3);
$w[3]=chr(97+$i4);
$words[]=$w;
}
shuffle($words);
?>
Here is a variant of the solution by recursive enumeration in a given alphabet. All parameters are set (number of characters, number of words and, in fact, the alphabet itself-character set). Just checked, it works.
<?
function generate($depth = 0, $str = "")
{
global $ABC, $lettersCount, $c, $wordsCount;
if($depth == $lettersCount)
{
mysql_query("INSERT INTO `table` SET `value`='$str'");
$c++;
return;
}
for($i = 0; $i < strlen($ABC); $i++)
{
generate($depth + 1, $str.$ABC[$i]);
if($c == $wordsCount)
{
echo "Сгенерировано $c слов";
exit;
}
}
}
$ABC = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
$lettersCount = 4; //кол-во символов
$wordsCount = 1000000; //кол-во слов
$c = 0;
generate();
?>
Alternatively, you can render an array with 1000000 keys with one value "" the
keys will consist of 4 digits, if you need a value, just take chr from the values. searching for a value by key will be faster than searching by value, and, it seems to me, much faster + working with numbers will be faster than working with strings. Show lines only as needed.
If we take only the lower case, then it turns out 33 ^ 4 = 1185921
I would generate all the options, and randomly delete the extra ones, up to 1000000.
And if the upper one is also needed, then I would go through the resulting sample and also randomly make the conversion to the upper one. And no uniqueness checks are needed.
But it's not at all clear why you need to write this in php, I would do it directly in mysql:
storage:
But in general, it's not fast either. Maybe it makes sense to generate dynamic sql and insert it in batches.
Well, then it remains only to remove the extra ones:
BEGIN
SET @str = 'абвгдежзийклмнопрстуфхцчшщъыьэюя';
SET @a = 1;
WHILE @a < 33 DO
SET @b = 1;
WHILE @b < 33 DO
SET @c = 1;
WHILE @c < 33 DO
SET @d = 1;
WHILE @d < 33 DO
-- тут можно добавить случайное преобразование в верхний регистр
INSERT INTO test (word) VALUES(concat(
substr(@str, @a, 1),
substr(@str, @b, 1),
substr(@str, @c, 1),
substr(@str, @d, 1)));
SET @d = @d + 1;
END WHILE;
SET @c = @c + 1;
END WHILE;
SET @b = @b + 1;
END WHILE;
SET @a = @a + 1;
END WHILE;
DELETE t FROM test AS t
INNER JOIN (
SELECT word FROM test
ORDER BY RAND()
LIMIT 0, 48576
) AS q
ON q.word = t.word;
as for working with the database (you have MySQL for sure?)
ALTER TABLE `something` DISABLE KEYS
several multi-row INSERTs (or one with all records at once, if you don’t mind the memory), perhaps even with DELAYED
ALTER TABLE `something` ENABLE KEYS
Are word comparisons case sensitive?
If not, then, according to the condition, the Russian alphabet:
33 letters, 4 symbols - 1,185,921 values, i.e. almost everything is needed.
It is possible to generate 185 921 random values that you do not take, but pour all the sotals
Or maybe like this?
<?
$ABC = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
$lettersCount = 4; //кол-во символов
$wordsCount = 1000000; //кол-во слов
$c = 0;
$result = array();
while(count($result) < $wordsCount && $c++ < 10)
{
$str = "";
for($i = 0; $i < $lettersCount; $i++)
$str .= $ABC[rand(0, strlen($ABC) - 1)];
$result[$str] = 1;
}
echo "Сгенерировано ".count($result)." строк";
?>
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question