I am a bit desperate about this problem... I have no idea how to face it.
Here is a simpler way to look at this problem:
If my insert cql query is:
"BEGIN BATCH USING CONSISTENCY ONE insert into my_table(id,'2014-04-11 8:00:00',...,'2014-04-15 10:00:00') values ('2036548',3.15,...,4.11) APPLY BATCH"
...and my data request cql query is:
"Select FIRST 100000 '2014-04-01 0:00:00'..'2014-04-16 0:00:00' from my_table where id=2036548"
...why does the inserted date 2014-04-15 10:00:00 changes to 2014-04-15 11:00:00 when pullling it from Cassandra?
The date pulling code in vb.net is:
Public Shared Function getCassandraDate(ByVal value As Byte()) As Date
Dim buffer As Byte() = New Byte(value.Length - 1) {}
value.CopyTo(buffer, 0)
Array.Reverse(buffer)
Dim ticks As Long = BitConverter.ToInt64(buffer, 0)
Dim dateTime As New System.DateTime(1970, 1, 1, 0, 0, 0, _
0)
dateTime = dateTime.AddMilliseconds(ticks)
Return dateTime.ToLocalTime
End Function
...same thing in PHP:
date_default_timezone_set("Europe/Paris");
$time = $this->unpackDate($packed_time);
$str_time = date('Y-m-d H:i:s',$time); //TODO : to local time
private function unpackDate($data, $is_name=null)
{
$arr = unpack('N2', $data);
// If we are on a 32bit architecture we have to explicitly deal with
// 64-bit twos-complement arithmetic since PHP wants to treat all ints
// as signed and any int over 2^31 - 1 as a float
if (PHP_INT_SIZE == 4) {
$hi = $arr[1];
$lo = $arr[2];
$isNeg = $hi < 0;
// Check for a negative
if ($isNeg) {
$hi = ~$hi & (int)0xffffffff;
$lo = ~$lo & (int)0xffffffff;
if ($lo == (int)0xffffffff) {
$hi++;
$lo = 0;
} else {
$lo++;
}
}
// Force 32bit words in excess of 2G to pe positive - we deal wigh sign
// explicitly below
if ($hi & (int)0x80000000) {
$hi &= (int)0x7fffffff;
$hi += 0x80000000;
}
if ($lo & (int)0x80000000) {
$lo &= (int)0x7fffffff;
$lo += 0x80000000;
}
$value = $hi * 4294967296 + $lo;
if ($isNeg)
$value = 0 - $value;
} else {
// Upcast negatives in LSB bit
if ($arr[2] & 0x80000000)
$arr[2] = $arr[2] & 0xffffffff;
// Check for a negative
if ($arr[1] & 0x80000000) {
$arr[1] = $arr[1] & 0xffffffff;
$arr[1] = $arr[1] ^ 0xffffffff;
$arr[2] = $arr[2] ^ 0xffffffff;
$value = 0 - $arr[1]*4294967296 - $arr[2] - 1;
} else {
$value = $arr[1]*4294967296 + $arr[2];
}
}
return $value / 1e3;
}
MORE DETAILS
Processing chain:
(1). insertion to Cassandra through .NET
(2). Cassandra data storage
(3). Pulling the data from PHP or .NET
Problem:
As for today, a date being 2014-04-15 10:00:00 in step (1), will come out as 2014-04-15 11:00:00 in step (3).
Details:
(regarding the date format in this chain)
(1). Local time in .NET (Timezone: "Europe/Paris"). Insertion cql that is being executed: "BEGIN BATCH USING CONSISTENCY ONE insert into my_table(id,'2014-04-11 8:00:00',...,'2014-04-15 10:00:00') values ('2036548',3.15,...,4.11) APPLY BATCH"
(2). ??? I don't know what Cassandra does here... ???
(3). Example of cql query to pull the data: "Select FIRST 100000 '2014-04-01 0:00:00'..'2014-04-16 0:00:00' from my_table where id=2036548"
. In php: date_default_timezone_set("Europe/Paris"); $str_time = date('Y-m-d H:i:s',$time);
. In .NET: dateTime.ToLocalTime
.
Extra info:
I think it worked well before the daylight saving time change some weeks ago. But I can not be sure about that.
If in step (1), if I changed the date to de date to UTC before inserting it, 2014-04-15 10:00:00 will become 2014-04-15 08:00:00 and the output will be 2014-04-15 09:00:00, which is still not correct.
I highly suspect that the trick here is between steps (1) and (2), that is to say, me not being able to understand how Cassandra treats dates.
Edit1:
@Ananth 's questions:
both cassandra and client run in the same datacenter?
It is complicated:
- Insertion in .NET from server1, a different server from server-cassandra (datacenter).
- PHP (to pull the data) running on server-cassandra.
- .NET (to pull the data) running on server1, not on server-cassandra.
- PHP and .NET pulling the same result.
Can you post your schema here?
Here it is
CREATE TABLE tsmeasures (
id int PRIMARY KEY
) WITH
comment='' AND
comparator=timestamp AND
read_repair_chance=0.100000 AND
gc_grace_seconds=0 AND
default_validation=double AND
min_compaction_threshold=4 AND
max_compaction_threshold=32 AND
replicate_on_write='true' AND
compaction_strategy_class='SizeTieredCompactionStrategy' AND
compression_parameters:sstable_compression='SnappyCompressor';
Edit2:
After testing it step by step, this is the result:
- real date : 2014-04-15 17:00:00 (localtime)
- cql text : '2014-04-15 15:00:00' (to UTC, done through .NET)
- PHP Cassandra Unpack of this date => $ticks = 1397577600 (*) The unpack is done with the piece of code shown before
Ticks converted (through http://www.epochconverter.com/ )
- GMT: Tue,
- 15 Apr 2014 16:00:00 GMT Your time zone: 4/15/2014 6:00:00 PM GMT+2
These results makes no sense to me...
More details:
cql insert:
"BEGIN BATCH USING CONSISTENCY ONE insert into tsmeasures(id,'2014-04-11 15:00:00',...,'2014-04-15 15:00:00') values ('2036548',0,...,4.85) APPLY BATCH"
cql fetch:
"SELECT '2014-04-10 16:00:00'..'2014-04-20 17:00:00' FROM tsmeasures WHERE id IN (2036548,2036479,2036174,650877)"
Thus '2014-04-15 15:00:00' is included in the range of the fetch, and I can identify it because it is the highest value.
I will keep digging...