PQ: Workaround decoding error for non-UTF-8 strings #36

Merged: 7 commits, Apr 8, 2019
3 changes: 1 addition & 2 deletions .travis.yml
@@ -41,8 +41,7 @@ before_script:
   - wget https://github.com/Photon-HDF5/phconvert/files/231343/Cy3.Cy5_diff_PIE-FRET.ptu.zip
   - unzip Cy3.Cy5_diff_PIE-FRET.ptu.zip
   - wget https://ndownloader.figshare.com/files/6955091 -O 161128_DM1_50pM_pH74.ptu
-  - wget https://www.dropbox.com/s/s8zzxcq7d2nfqe0/20161027_DM1_1nM_pH7_20MHz1.ptu.zip
-  - unzip 20161027_DM1_1nM_pH7_20MHz1.ptu.zip
+  - wget https://ndownloader.figshare.com/files/14828594 -O 20161027_DM1_1nM_pH7_20MHz1.ptu
   - wget https://ndownloader.figshare.com/files/13675271 -O TestFile_2.ptu
   - wget https://github.com/dwaithe/FCS_point_correlator/raw/master/focuspoint/topfluorPE_2_1_1_1.pt3
   - wget https://github.com/Photon-HDF5/phconvert/files/1380341/DNA_FRET_0.5nM.pt3.zip
5 changes: 2 additions & 3 deletions appveyor.yml
@@ -45,8 +45,7 @@ before_test:
   - ps: wget https://github.com/Photon-HDF5/phconvert/files/231343/Cy3.Cy5_diff_PIE-FRET.ptu.zip -OutFile Cy3.Cy5_diff_PIE-FRET.ptu.zip
   - 7z e Cy3.Cy5_diff_PIE-FRET.ptu.zip
   - ps: wget https://ndownloader.figshare.com/files/6955091 -OutFile 161128_DM1_50pM_pH74.ptu
-  - curl -fsSLO https://www.dropbox.com/s/s8zzxcq7d2nfqe0/20161027_DM1_1nM_pH7_20MHz1.ptu.zip
-  - 7z e 20161027_DM1_1nM_pH7_20MHz1.ptu.zip
+  - ps: wget https://ndownloader.figshare.com/files/14828594 -O 20161027_DM1_1nM_pH7_20MHz1.ptu
   - ps: wget https://ndownloader.figshare.com/files/13675271 -OutFile TestFile_2.ptu
   - ps: wget https://github.com/dwaithe/FCS_point_correlator/raw/master/focuspoint/topfluorPE_2_1_1_1.pt3 -OutFile topfluorPE_2_1_1_1.pt3
   - ps: wget https://github.com/Photon-HDF5/phconvert/files/1380341/DNA_FRET_0.5nM.pt3.zip -OutFile DNA_FRET_0.5nM.pt3.zip
@@ -64,7 +63,7 @@ after_test:
   - cd %APPVEYOR_BUILD_FOLDER%
   - python setup.py bdist_wheel
   - echo %PATH%
-  - deactivate
+  - conda deactivate
   - path
   - where python
   - where git
2 changes: 2 additions & 0 deletions conda.recipe/meta.yaml
@@ -16,6 +16,8 @@ requirements:
     - numpy >=1.9
     - hdf5
     - pytables
+    - mock
+    - pbr
 
   run:
     - python
8 changes: 7 additions & 1 deletion phconvert/pqreader.py
@@ -659,7 +659,13 @@ def _ptu_read_tag(s, offset, tag_type_r):
 
     # Some tag types have additional data
     if tag['type'] == 'tyAnsiString':
-        tag['data'] = s[offset: offset + tag['value']].rstrip(b'\0').decode()
+        byte_string = s[offset: offset + tag['value']].rstrip(b'\0')
+        try:
+            tag['data'] = byte_string.decode()  # try decoding from UTF-8
+        except UnicodeDecodeError:
+            # Not UTF-8, trying 'latin1'
+            # See https://github.com/Photon-HDF5/phconvert/issues/35
+            tag['data'] = byte_string.decode('latin1')
         offset += tag['value']
     elif tag['type'] == 'tyFloat8Array':
         tag['data'] = np.frombuffer(s, dtype='float', count=tag['value'] / 8)
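The decode-with-fallback pattern above can be sketched in isolation. This is a hypothetical standalone helper (not part of phconvert's API) illustrating why the fallback never raises: latin1 maps every byte value 0x00-0xFF to a code point, so decoding always succeeds even for arbitrary binary tag data.

```python
def decode_tag_string(byte_string: bytes) -> str:
    """Decode a PTU tag string, falling back to latin1 for non-UTF-8 bytes."""
    byte_string = byte_string.rstrip(b'\0')  # strip NUL padding
    try:
        return byte_string.decode()  # UTF-8 is the default codec
    except UnicodeDecodeError:
        # latin1 decodes any byte sequence, so this branch cannot fail
        return byte_string.decode('latin1')

print(decode_tag_string(b'plain ASCII\x00\x00'))  # plain ASCII
print(decode_tag_string(b'\xb5s resolution'))     # 0xB5 is not valid UTF-8 alone,
                                                  # but is the micro sign in latin1
```

The trade-off of this workaround: bytes that were written in some other non-UTF-8 encoding will decode without error but may map to the wrong characters, which the linked issue #35 accepts as preferable to crashing the whole file load.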
3 changes: 1 addition & 2 deletions phconvert/smreader.py
@@ -120,10 +120,9 @@ def load_sm(fname, return_labels=False):
     sm_dtype = np.dtype([('timestamp', '>i8'), ('detector', '>u4')])
 
     # View of the binary data as an array (no copy performed)
-    data = np.frombuffer(rawdata[:valid_size], dtype=sm_dtype)
+    data = np.frombuffer(rawdata[:valid_size], dtype=sm_dtype).copy()
 
     # Swap byte order inplace to little endian
-    data.setflags(write=True)
     data = data.byteswap(True).newbyteorder()
 
     if return_labels:
2 changes: 1 addition & 1 deletion phconvert/test_bhreader.py
@@ -15,7 +15,7 @@ def test_import_SPC_150_nanotime(self):
                         'test_files/test_noise.asc')
 
         data = bhreader.load_spc(input_file, 'SPC-150')
-        check = pd.read_table(check_file, delimiter=' ', dtype='int64',
+        check = pd.read_csv(check_file, delimiter=' ', dtype='int64',
                             usecols=[0, 1], header=None).values.T  # Way faster than numpy
 
         # Same number of photons in both files
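`pd.read_table` was deprecated (pandas 0.24 onward), and `pd.read_csv` with an explicit `delimiter` is the drop-in replacement, which is all this last hunk does. A self-contained sketch with fabricated data in place of the real `test_noise.asc` check file:

```python
import io
import pandas as pd

# Two space-separated integer columns, standing in for the check file
text = "1 100\n2 200\n3 300\n"

# Same call shape as the test: read two columns, no header, transpose
# so each row of `check` is one column of the file
check = pd.read_csv(io.StringIO(text), delimiter=' ', dtype='int64',
                    usecols=[0, 1], header=None).values.T

print(check.tolist())  # [[1, 2, 3], [100, 200, 300]]
```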