-
Notifications
You must be signed in to change notification settings - Fork 1
/
BitVector.py
3659 lines (3058 loc) · 169 KB
/
BitVector.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
__version__ = '3.4.7'
__author__ = "Avinash Kak ([email protected])"
__date__ = '2017-Janurary-26'
__url__ = 'https://engineering.purdue.edu/kak/dist/BitVector-3.4.7.html'
__copyright__ = "(C) 2017 Avinash Kak. Python Software Foundation."
__doc__ = '''
BitVector.py
Version: ''' + __version__ + '''
Date: ''' + __date__ + '''
@title
CHANGES IN THIS VERSION:
Version 3.4.7 fixes a Python 3 specific bug in the write_to_file()
method of the module. While I was at it, I have also changed the name
of the method write_bits_to_fileobject() to
write_bits_to_stream_object() so that there is no confusion between
write_to_file() and write_bits_to_fileobject(). For backward
compatibility, write_bits_to_fileobject() will continue to work if you
are writing a bitvector to a string stream object.
Version 3.4.6 fixes what was hopefully the last remaining bug in using
negative index values for slice assignments.
See the "CHANGE HISTORY" section for a complete history of all the
changes made in the different versions of this module.
@title
INTRODUCTION:
The BitVector class is for a memory-efficient packed representation of
bit arrays and for logical operations on such arrays. The operations
supported on bit vectors are:
__add__ for concatenation
__and__ for bitwise logical AND
__contains__
__eq__, __ne__, __lt__, __le__, __gt__, __ge__
__getitem__ for indexed and sliced access
__int__ for returning integer value
__invert__ for inverting the 1's and 0's
__iter__ for iterating through
__len__ for len()
__lshift__ for circular shifts to the left
__or__ for bitwise logical OR
__rshift__ for circular shifts to the right
__setitem__ for indexed and sliced setting
__str__ for str()
__xor__ for bitwise logical XOR
close_file_object
count_bits
count_bits_sparse faster for sparse bit vectors
deep_copy
divide_into_two
gcd for greatest common divisor
gen_random_bits
get_bitvector_in_ascii
get_bitvector_in_hex
gf_divide_by_modulus for modular divisions in GF(2^n)
gf_MI for multiplicative inverse in GF(2^n)
gf_multiply for multiplications in GF(2)
gf_multiply_modular for multiplications in GF(2^n)
hamming_distance
int_val for returning the integer value
is_power_of_2
is_power_of_2_sparse faster for sparse bit vectors
jaccard_distance
jaccard_similarity
length
min_canonical for min int value canonical form
multiplicative_inverse
next_set_bit
pad_from_left
pad_from_right
permute
rank_of_bit_set_at_index
read_bits_from_file
reset
reverse
runs
set_value
shift_left for non-circular left shift
shift_right for non-circular right shift
test_for_primality
unpermute
write_bits_to_stream_object
write_to_file
@title
CONSTRUCTING BIT VECTORS:
You can construct a bit vector in the following different ways:
@tagC0
(C0) You construct an EMPTY bit vector using the following syntax:
bv = BitVector(size = 0)
@tagC1
(C1) You can construct a bit vector directly from either a tuple or a
list of bits, as in
bv = BitVector(bitlist = [1,0,1,0,0,1,0,1,0,0,1,0,1,0,0,1])
@tagC2
(C2) You can construct a bit vector from an integer by
bv = BitVector(intVal = 56789)
The bits stored now will correspond to the binary representation
of the integer. The resulting bit vector is the shortest
possible bit vector for the integer value supplied. For example,
when intVal is 0, the bit vector constructed will consist of just
the bit 0.
@tagC3
(C3) When initializing a bit vector with an intVal as shown above, you
can also specify a size for the bit vector:
bv = BitVector(intVal = 0, size = 8)
will return the bit vector consisting of the bit pattern
00000000. The zero padding needed for meeting the size
requirement is always on the left. If the size supplied is
smaller than what it takes to create the shortest possible bit
vector for intVal, an exception is thrown.
@tagC4
(C4) You can create a zero-initialized bit vector of a given size by
bv = BitVector(size = 62)
This bit vector will hold exactly 62 bits, all initialized to
the 0 bit value.
@tagC5
(C5) You can construct a bit vector from a disk file by a two-step
procedure. First you construct an instance of bit vector by
bv = BitVector(filename = 'somefile')
This bit vector itself is incapable of holding the bits. To now
create bit vectors that actually hold the bits, you need to make
the following sort of a call on the above variable bv:
bv1 = bv.read_bits_from_file(64)
bv1 will be a regular bit vector containing 64 bits from the disk
file. If you want to re-read a file from the beginning for some
reason, you must obviously first close the file object that was
acquired with a call to the BitVector constructor with a filename
argument. This can be accomplished by
bv.close_file_object()
@tagC6
(C6) You can construct a bit vector from a string of 1's and 0's by
bv = BitVector(bitstring = '110011110000')
@tagC7
(C7) Yet another way to construct a bit vector is to read the bits
directly from a file-like object, as in
import io
x = "111100001111"
fp_read = io.StringIO( x )
bv = BitVector(fp = fp_read)
print(bv) # 111100001111
@tagC8
(C8) You can also construct a bit vector directly from a text string
as shown by the example:
bv3 = BitVector(textstring = "hello")
print(bv3) # 0110100001100101011011000110110001101111
mytext = bv3.get_bitvector_in_ascii()
print mytext # hello
The bit vector is constructed by using the one-byte ASCII
encoding of the characters in the text string.
@tagC9
(C9) You can also construct a bit vector directly from a string of hex
digits as shown by the example:
bv4 = BitVector(hexstring = "68656c6c6f")
print(bv4) # 0110100001100101011011000110110001101111
myhexstring = bv4.get_bitvector_in_hex()
print myhexstring # 68656c6c6
@tagC10
(C10) You can also construct a bit vector directly from a bytes type
object you previously created in your script. This can be useful
when you are trying to recover the integer parameters stored in
public and private keys. A typical usage scenario:
keydata = base64.b64decode(open(sys.argv[1]).read().split(None)[1])
bv = BitVector.BitVector(rawbytes = keydata)
where sys.argv[1] is meant to supply the name of a public key
file (in this case an SSH RSA public key file).
@title
OPERATIONS SUPPORTED BY THE BITVECTOR CLASS:
@title
DISPLAYING BIT VECTORS:
@tag1
(1) Since the BitVector class implements the __str__ method, a bit
vector can be displayed on a terminal by
print(bitvec)
or, for only Python 2.x, by
print bitvec
Basically, you can always obtain the string representation of a
bit vector by
str(bitvec)
and integer value by
int(bitvec)
@title
ACCESSING AND SETTING INDIVIDUAL BITS AND SLICES:
@tag2
(2) Any single bit of a bit vector bv can be set to 1 or 0 by
bv[M] = 1_or_0
print( bv[M] )
or, for just Python 2.x, by
bv[M] = 1_or_0
print bv[M]
for accessing (and setting) the bit at the position that is indexed
M. You can retrieve the bit at position M by bv[M]. Note that the
index 0 corresponds to the first bit at the left end of a bit
pattern. This is made possible by the implementation of the
__getitem__ and __setitem__ methods.
@tag3
(3) A slice of a bit vector obtained by
bv[i:j]
is a bit vector constructed from the bits at index positions from i
through j-1. This is made possible by the implementation of the
__getitem__ method.
@tag4
(4) You can also carry out slice assignment:
bv1 = BitVector(size = 25)
bv2 = BitVector(bitstring = '1010001')
bv1[6:9] = bv2[0:3]
bv3 = BitVector(bitstring = '101')
bv1[0:3] = bv3
The first slice assignment will set the 6th, 7th, and the 8th bits
of the bit vector bv1 according to the first three bits of bv2.
The second slice assignment will set the first three bits of bv1
according to the three bits in bv3. This is made possible by the
slice setting code in the __setitem__ method.
@tag5
(5) You can iterate over a bit vector, as illustrated by
for bit in bitvec:
print(bit)
This is made possible by the override definition for the special
__iter__() method.
@tag6
(6) Negative subscripts for array-like indexing are supported.
Therefore,
bitvec[-i]
is legal assuming that the index range is not violated. A negative
index carries the usual Python interpretation: The last element of
a bit vector is indexed -1 and the first element -(n+1) if n is the
total number of bits in the bit vector. Negative subscripts are
made possible by special-casing such access in the implementation
of the __getitem__ method (actually it is the _getbit method).
@tag7
(7) You can reset a previously constructed bit vector to either the
all-zeros state or the all-ones state by
bv1 = BitVector(size = 25)
...
...
bv1.reset(1)
...
...
bv1.reset(0)
The first call to reset() will set all the bits of bv1 to 1's and
the second call all the bits to 0's. What the method accomplishes
can be thought of as in-place resetting of the bits. The method
does not return anything.
@title
LOGICAL OPERATIONS ON BIT VECTORS:
@tag8
(8) Given two bit vectors bv1 and bv2, you can perform bitwise
logical operations on them by
result_bv = bv1 ^ bv2 # for bitwise XOR
result_bv = bv1 & bv2 # for bitwise AND
result_bv = bv1 | bv2 # for bitwise OR
result_bv = ~bv1 # for bitwise negation
These are made possible by implementing the __xor__, __and__,
__or__, and __invert__ methods, respectively.
@title
COMPARING BIT VECTORS:
@tag9
(9) Given two bit vectors bv1 and bv2, you can carry out the following
comparisons that return Boolean values:
bv1 == bv2
bv1 != bv2
bv1 < bv2
bv1 <= bv2
bv1 > bv2
bv1 >= bv2
The equalities and inequalities are determined by the integer
values associated with the bit vectors. These operator
overloadings are made possible by providing implementation code for
__eq__, __ne__, __lt__, __le__, __gt__, and __ge__, respectively.
@title
OTHER SUPPORTED OPERATIONS:
@tag10
(10) permute()
unpermute()
You can permute and unpermute bit vectors:
bv_permuted = bv.permute(permutation_list)
bv_unpermuted = bv.unpermute(permutation_list)
Both these methods return new bitvector objects. Permuting a
bitvector means that you select its bits in the sequence specified
by the argument permutation_list. Calling unpermute() with the same
argument permutation_list restores the sequence of bits to what it
was in the original bitvector.
@tag11
(11) circular shifts
Left and right circular rotations can be carried out by
bitvec << N
bitvec >> N
for circular rotations to the left and to the right by N bit
positions. These operator overloadings are made possible by
implementing the __lshift__ and __rshift__ methods, respectively.
Note that both these operators return the bitvector on which they
are invoked. This allows for a chained invocation of these two
operators.
@tag12
(12) shift_left()
shift_right()
If you want to shift in-place a bitvector non-circularly:
bitvec = BitVector(bitstring = '10010000')
bitvec.shift_left(3) # 10000000
bitvec.shift_right(3) # 00010000
As a bitvector is shifted non-circularly to the left or to the
right, the exposed bit positions at the opposite end are filled
with zeros. These two methods return the bitvector object on which
they are invoked. This is to allow for chained invocations of
these methods.
@tag13
(13) divide_into_two()
A bitvector containing an even number of bits can be divided into
two equal parts by
[left_half, right_half] = bitvec.divide_into_two()
where left_half and right_half hold references to the two returned
bitvectors. The method throws an exception if called on a
bitvector with an odd number of bits.
@tag14
(14) int_val()
You can find the integer value of a bitvector object by
bitvec.int_val()
or by
int(bitvec)
As you expect, a call to int_val() returns an integer value.
@tag15
(15) string representation
You can convert a bitvector into its string representation by
str(bitvec)
@tag16
(16) concatenation
Because __add__ is supplied, you can always join two bitvectors by
bitvec3 = bitvec1 + bitvec2
bitvec3 is a new bitvector object that contains all the bits of
bitvec1 followed by all the bits of bitvec2.
@tag17
(17) length()
You can find the length of a bitvector by
len = bitvec.length()
@tag18
(18) deep_copy()
You can make a deep copy of a bitvector by
bitvec_copy = bitvec.deep_copy()
Subsequently, any alterations to either of the bitvector objects
bitvec and bitvec_copy will not affect the other.
@tag19
(19) read_bits_from_file()
As mentioned earlier, you can construct bitvectors directly from
the bits in a disk file through the following calls. As you can
see, this requires two steps: First you make a call as illustrated
by the first statement below. The purpose of this call is to
create a file object that is associated with the variable bv.
Subsequent calls to read_bits_from_file(n) on this variable return
a bitvector for each block of n bits thus read. The
read_bits_from_file() throws an exception if the argument n is not
a multiple of 8.
bv = BitVector(filename = 'somefile')
bv1 = bv.read_bits_from_file(64)
bv2 = bv.read_bits_from_file(64)
...
...
bv.close_file_object()
When reading a file as shown above, you can test the attribute
more_to_read of the bitvector object in order to find out if there
is more to read in the file. The while loop shown below reads all
of a file in 64-bit blocks:
bv = BitVector( filename = 'testinput4.txt' )
print("Here are all the bits read from the file:")
while (bv.more_to_read):
bv_read = bv.read_bits_from_file( 64 )
print(bv_read)
bv.close_file_object()
The size of the last bitvector constructed from a file corresponds
to how many bytes remain unread in the file at that point. It is
your responsibility to zero-pad the last bitvector appropriately
if, say, you are doing block encryption of the whole file.
@tag20
(20) write_to_file()
You can write a bit vector directly to a file by calling
write_to_file(), as illustrated by the following example that reads
one bitvector from a file and then writes it to another file:
bv = BitVector(filename = 'input.txt')
bv1 = bv.read_bits_from_file(64)
print(bv1)
FILEOUT = open('output.bits', 'wb')
bv1.write_to_file(FILEOUT)
FILEOUT.close()
bv = BitVector(filename = 'output.bits')
bv2 = bv.read_bits_from_file(64)
print(bv2)
The method write_to_file() throws an exception if the size of the
bitvector on which the method is invoked is not a multiple of 8.
This method does not return anything.
IMPORTANT FOR WINDOWS USERS: When writing an internally generated
bit vector out to a disk file, it is important to open
the file in the binary mode as shown. Otherwise, the
bit pattern 00001010 ('\\n') in your bitstring will be
written out as 0000110100001010 ('\\r\\n'), which is
the linebreak on Windows machines.
@tag21
(21) write_bits_to_stream_object()
You can also write a bitvector directly to a stream object, as
illustrated by:
fp_write = io.StringIO()
bitvec.write_bits_to_stream_object(fp_write)
print(fp_write.getvalue())
This method does not return anything.
@tag22
(22) pad_from_left()
pad_from_right()
You can pad a bitvector from the left or from the right with a
designated number of zeros:
bitvec.pad_from_left(n)
bitvec.pad_from_right(n)
These two methods return the bitvector object on which they are
invoked. So you can think of these two calls as carrying out
in-place extensions of the bitvector (although, under the hood, the
extensions are carried out by giving new longer _vector attributes
to the bitvector objects).
@tag23
(23) if x in y:
You can test if a bit vector x is contained in another bit vector y
by using the syntax 'if x in y'. This is made possible by the
override definition for the special __contains__ method.
@tag24
(24) set_value()
You can call set_value() to change the bit pattern associated with
a previously constructed bitvector object:
bv = BitVector(intVal = 7, size =16)
print(bv) # 0000000000000111
bv.set_value(intVal = 45)
print(bv) # 101101
You can think of this method as carrying out an in-place resetting
of the bit array in a bitvector. The method does not return
anything. The allowable modes for changing the internally stored
bit pattern for a bitvector are the same as for the constructor.
@tag25
(25) count_bits()
You can count the number of bits set in a BitVector instance by
bv = BitVector(bitstring = '100111')
print(bv.count_bits()) # 4
A call to count_bits() returns an integer value that is equal to
the number of bits set in the bitvector.
@tag26
(26) count_bits_sparse()
For folks who use bit vectors with millions of bits in them but
with only a few bits set, your bit counting will go much, much
faster if you call count_bits_sparse() instead of count_bits():
However, for dense bitvectors, I expect count_bits() to work
faster.
# a BitVector with 2 million bits:
bv = BitVector(size = 2000000)
bv[345234] = 1
bv[233]=1
bv[243]=1
bv[18]=1
bv[785] =1
print(bv.count_bits_sparse()) # 5
A call to count_bits_sparse() returns an integer whose value is the
number of bits set in the bitvector.
@tag27
(27) jaccard_similarity()
jaccard_distance()
hamming_distance()
You can calculate the similarity and the distance between two
bitvectors using the Jaccard similarity coefficient and the Jaccard
distance. Also, you can calculate the Hamming distance between two
bitvectors:
bv1 = BitVector(bitstring = '11111111')
bv2 = BitVector(bitstring = '00101011')
print bv1.jaccard_similarity(bv2) # 0.675
print(str(bv1.jaccard_distance(bv2))) # 0.375
print(str(bv1.hamming_distance(bv2))) # 4
For both jaccard_distance() and jaccard_similarity(), the value
returned is a floating point number between 0 and 1. The method
hamming_distance() returns a number that is equal to the number of
bit positions in which the two operand bitvectors disagree.
@tag28
(28) next_set_bit()
Starting from a given bit position, you can find the position index
of the next set bit by
bv = BitVector(bitstring = '00000000000001')
print(bv.next_set_bit(5)) # 13
In this example, we are asking next_set_bit() to return the index
of the bit that is set after the bit position that is indexed 5. If
no next set bit is found, the method returns -1. A call to
next_set_bit() always returns a number.
@tag29
(29) rank_of_bit_set_at_index()
You can measure the "rank" of a bit that is set at a given
position. Rank is the number of bits that are set up to the
position of the bit you are interested in.
bv = BitVector(bitstring = '01010101011100')
print(bv.rank_of_bit_set_at_index(10)) # 6
The value 6 returned by this call to rank_of_bit_set_at_index() is
the number of bits set up to the position indexed 10 (including
that position). This method throws an exception if there is no bit
set at the argument position. Otherwise, it returns the rank as a
number.
@tag30
(30) is_power_of_2()
is_power_of_2_sparse()
You can test whether the integer value of a bit vector is a power
of two. The sparse version of this method will work much faster
for very long bit vectors. However, the regular version may work
faster for dense bit vectors.
bv = BitVector(bitstring = '10000000001110')
print(bv.is_power_of_2())
print(bv.is_power_of_2_sparse())
Both these predicates return 1 for true and 0 for false.
@tag31
(31) reverse()
Given a bit vector, you can construct a bit vector with all the
bits reversed, in the sense that what was left to right before now
becomes right to left.
bv = BitVector(bitstring = '0001100000000000001')
print(str(bv.reverse()))
A call to reverse() returns a new bitvector object whose bits are
in reverse order in relation to the bits in the bitvector on which
the method is invoked.
@tag32
(32) gcd()
You can find the greatest common divisor of two bit vectors:
bv1 = BitVector(bitstring = '01100110') # int val: 102
bv2 = BitVector(bitstring = '011010') # int val: 26
bv = bv1.gcd(bv2)
print(int(bv)) # 2
The result returned by gcd() is a bitvector object.
@tag33
(33) multiplicative_inverse()
This method calculates the multiplicative inverse using normal
integer arithmetic. [For such inverses in a Galois Field GF(2^n),
use the method gf_MI().]
bv_modulus = BitVector(intVal = 32)
bv = BitVector(intVal = 17)
bv_result = bv.multiplicative_inverse( bv_modulus )
if bv_result is not None:
print(str(int(bv_result))) # 17
else: print "No multiplicative inverse in this case"
What this example says is that the multiplicative inverse of 17
modulo 32 is 17. That is because 17 times 17 modulo 32 equals 1.
When using this method, you must test the returned value for
None. If the returned value is None, that means that the number
corresponding to the bitvector on which the method is invoked does
not possess a multiplicative-inverse with respect to the modulus.
When the multiplicative inverse exists, the result returned by
calling multiplicative_inverse() is a bitvector object.
@tag34
(34) gf_MI()
To calculate the multiplicative inverse of a bit vector in the
Galois Field GF(2^n) with respect to a modulus polynomial, call
gf_MI() as follows:
modulus = BitVector(bitstring = '100011011')
n = 8
a = BitVector(bitstring = '00110011')
multi_inverse = a.gf_MI(modulus, n)
print multi_inverse # 01101100
A call to gf_MI() returns a bitvector object.
@tag35
(35) gf_multiply()
If you just want to multiply two bit patterns in GF(2):
a = BitVector(bitstring='0110001')
b = BitVector(bitstring='0110')
c = a.gf_multiply(b)
print(c) # 00010100110
As you would expect, in general, the bitvector returned by this
method is longer than the two operand bitvectors. A call to
gf_multiply() returns a bitvector object.
@tag36
(36) gf_multiply_modular()
If you want to carry out modular multiplications in the Galois
Field GF(2^n):
modulus = BitVector(bitstring='100011011') # AES modulus
n = 8
a = BitVector(bitstring='0110001')
b = BitVector(bitstring='0110')
c = a.gf_multiply_modular(b, modulus, n)
print(c) # 10100110
The call to gf_multiply_modular() returns the product of the two
bitvectors a and b modulo the bitvector modulus in GF(2^8). A call
to gf_multiply_modular() returns is a bitvector object.
@tag37
(37) gf_divide_by_modulus()
To divide a bitvector by a modulus bitvector in the Galois Field
GF(2^n):
mod = BitVector(bitstring='100011011') # AES modulus
n = 8
a = BitVector(bitstring='11100010110001')
quotient, remainder = a.gf_divide_by_modulus(mod, n)
print(quotient) # 00000000111010
print(remainder) # 10001111
What this example illustrates is dividing the bitvector a by the
modulus bitvector mod. For a more general division of one
bitvector a by another bitvector b, you would multiply a by the MI
of b, where MI stands for "multiplicative inverse" as returned by
the call to the method gf_MI(). A call to gf_divide_by_modulus()
returns two bitvectors, one for the quotient and the other for the
remainder.
@tag38
(38) runs()
You can extract from a bitvector the runs of 1's and 0's in the
vector as follows:
bv = BitVector(bitlist = (1,1, 1, 0, 0, 1))
print(str(bv.runs())) # ['111', '00', '1']
The object returned by runs() is a list of strings, with each
element of this list being a string of 1's and 0's.
@tag39
(39) gen_random_bits()
You can generate a bitvector with random bits with the bits
spanning a specified width. For example, if you wanted a random
bit vector to fully span 32 bits, you would say
bv = BitVector(intVal = 0)
bv = bv.gen_random_bits(32)
print(bv) # 11011010001111011010011111000101
As you would expect, gen_random_bits() returns a bitvector object.
@tag40
(40) test_for_primality()
You can test whether a randomly generated bit vector is a prime
number using the probabilistic Miller-Rabin test
bv = BitVector(intVal = 0)
bv = bv.gen_random_bits(32)
check = bv.test_for_primality()
print(check)
The test_for_primality() methods returns a floating point number
close to 1 for prime numbers and 0 for composite numbers. The
actual value returned for a prime is the probability associated
with the determination of its primality.
@tag41
(41) get_bitvector_in_ascii()
You can call get_bitvector_in_ascii() to directly convert a bit
vector into a text string (this is a useful thing to do only if the
length of the vector is an integral multiple of 8 and every byte in
your bitvector has a print representation):
bv = BitVector(textstring = "hello")
print(bv) # 0110100001100101011011000110110001101111
mytext = bv3.get_bitvector_in_ascii()
print mytext # hello
This method is useful when you encrypt text through its bitvector
representation. After decryption, you can recover the text using
the call shown here. A call to get_bitvector_in_ascii() returns a
string.
@tag42
(42) get_bitvector_in_hex()
You can directly convert a bit vector into a hex string (this is a
useful thing to do only if the length of the vector is an integral
multiple of 4):
bv4 = BitVector(hexstring = "68656c6c6f")
print(bv4) # 0110100001100101011011000110110001101111
myhexstring = bv4.get_bitvector_in_hex()
print myhexstring # 68656c6c6
This method throws an exception if the size of the bitvector is not
a multiple of 4. The method returns a string.
@tag43
(43) close_file_object()
When you construct bitvectors by block scanning a disk file, after
you are done, you can call this method to close the file object
that was created to read the file:
bv = BitVector(filename = 'somefile')
bv1 = bv.read_bits_from_file(64)
bv.close_file_object()
The constructor call in the first statement creates a file object
for reading the bits. It is this file object that is closed when
you call close_file_object().
@tag44
(44) min_canonical()
This methods returns the canonical form of a bitvector, which
corresponds to a circularly rotated version of the bitvector with the
largest number of leading zeros.
bv = BitVector(intVal = 5678, size = 14)
min_bv = bv.min_canonical()
print(bv) # 01011000101110
print(min_bv) # 00010111001011
print(int(min_bv)) # 1483
@title
CHANGE HISTORY:
Version 3.4.4
This version fixes the behavior of the module for the edge case of an
empty BitVector instance. (An empty BitVector has no bits at all.)
Previously, invoking the count_bits() and runs() methods on an empty
BitVector instance produced results that were inconsistent with those
from regular instances.
Version 3.4.3
This is a quick release that fixes the problem with the relative imports
in the previous version. Python3 does not like relative imports.
Version 3.4.2
Unfortunately, the packaging of the previous version was not exporting
the module metadata. That problem has been fixed in this version.
Version 3.4.1
This version fixes two module packaging errors in the previous version.
One error related to the specification of the "packages" keyword in
setup.py and the other error related to not updating the manifest with
the HTML documentation file for the module.
Version 3.4
This version includes a bug fix and significant improvements to the
documentation. The new documentation is clearer about what is returned
by each method and, when a method throws an exception, that fact is
stated in the documentation associated with the method. The condition
that triggers the exception is also stated. The bug fix was in the
test_for_primality() method. If invoked for testing the primality of
1, it would get trapped in an infinite loop. Additionally, when
constructing a bitvector from a hex string, this version allows the hex
characters to be in either case. Previously, only lowercase hex
characters were acceptable. Finally, I have changed the names of a
couple of methods to better reflect their function. The old names
would, however, continue to work for backward compatibility.
Version 3.3.2:
This version fixes a bug in the constructor code for creating a bit
vector from a text string. The bug was triggered by character escapes
in such strings.
Version 3.3.1:
This is a minor upgrade to make the syntax of the API method
declarations more uniform. Previously, while most of the method names
used underscores to connect multiple words, some used camelcasing. Now
all use underscores. For backward compatibility, the old calls will
continue to work.
Version 3.3:
This version includes: (1) One additional constructor mode that allows
a bit vector to be constructed directly from the bytes type objects in
the memory. (2) A bugfix in the slice function for the case when the
upper and the lower bounds of the slice range are identical. (3) A
bugfix for the next_set_bit() method.
Version 3.2:
This version includes support for constructing bit vectors directly
from text strings and hex strings. This version also includes a safety
check on the sizes of the two argument bit vectors when calculating
Jaccard similarity between the two.
Version 3.1.1:
This version includes: (1) a fix to the module test code to account for
how string input is handled in the io.StringIO class in Python 2.7; (2)
some improvements to the documentation.
Version 3.1:
This version includes: (1) Correction for a documentation error; (2)
Fix for a bug in slice assignment when one or both of the slice limits
were left unspecified; (3) The non-circular bit shift methods now
return self so that they can be chained; (4) A method for testing a
bitvector for its primality; and (5) A method that uses Python's
'random.getrandbits()' to generate a bitvector that can serve as
candidate for primes whose bitfield size is specified.
Version 3.0:
This is a Python 3.x compliant version of the latest incarnation of the
BitVector module. This version should work with both Python 2.x and
Python 3.x.
Version 2.2:
Fixed a couple of bugs, the most important being in the bitvector