String type.
This type can be used to store any arbitrary byte sequence of variable length.
Since the size of a tensor is fixed, all of its values must be provided when a tensor of this type is created, so that TensorFlow can compute and allocate the right amount of memory. The data in the tensor is then initialized once and cannot be modified afterwards.
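To make the allocation constraint concrete, the following is a minimal sketch that uses only the JDK, not the TensorFlow API. It shows that the number of bytes a string occupies depends on its content and charset, so a total can only be computed once every value is known. The class `StringSizing` and the method `totalEncodedBytes` are hypothetical names introduced for illustration, and the sketch makes no claim about TensorFlow's actual memory layout.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StringSizing {

    // Hypothetical helper: sums the encoded length of every value.
    // It counts payload bytes only and ignores any per-element overhead
    // that a real tensor implementation may add.
    public static long totalEncodedBytes(Charset charset, String... values) {
        long total = 0;
        for (String value : values) {
            total += value.getBytes(charset).length;
        }
        return total;
    }

    public static void main(String[] args) {
        // "\u00e9" (e with acute accent) takes two bytes in UTF-8,
        // so "caf\u00e9" needs 5 bytes for its 4 characters.
        long total = totalEncodedBytes(StandardCharsets.UTF_8, "a", "caf\u00e9", "TensorFlow");
        System.out.println(total); // 1 + 5 + 10 = 16
    }
}
```

Changing any single value after the fact could change its encoded length, which is consistent with the data being written once and then treated as read-only.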
Public Methods
abstract NdArray<byte[]> | asBytes() | |
---|---|---|
abstract static TString | scalarOf(String value) | Allocates a new tensor for storing a string scalar. |
abstract static TString | tensorOf(Shape shape, DataBuffer<String> data) | Allocates a new tensor with the given shape and data. |
abstract static TString | tensorOf(NdArray<String> src) | Allocates a new tensor which is a copy of a given array. |
abstract static TString | tensorOf(Charset charset, Shape shape, DataBuffer<String> data) | Allocates a new tensor with the given shape and data. |
abstract static TString | tensorOf(Charset charset, NdArray<String> src) | Allocates a new tensor which is a copy of a given array. |
abstract static TString | tensorOfBytes(Shape shape, DataBuffer<byte[]> data) | Allocates a new tensor with the given shape and raw bytes. |
abstract static TString | tensorOfBytes(NdArray<byte[]> src) | Allocates a new tensor which is a copy of a given array of raw bytes. |
abstract TString | using(Charset charset) | Use a specific charset for decoding data from a string tensor, instead of the default UTF-8. |
abstract static TString | vectorOf(String... values) | Allocates a new tensor for storing a vector of strings. |
Public Methods
public abstract NdArray<byte[]> asBytes ()
Returns
- the tensor data as an n-dimensional array of raw byte sequences.
public static abstract TString scalarOf (String value)
Allocates a new tensor for storing a string scalar.
The string is encoded into bytes using the UTF-8 charset.
Parameters
value | scalar value to store in the new tensor |
---|---|
Returns
- the new tensor
public static abstract TString tensorOf (Shape shape, DataBuffer<String> data)
Allocates a new tensor with the given shape and data.
The data will be copied from the provided buffer to the tensor after it is allocated. The strings are encoded into bytes using the UTF-8 charset.
Parameters
shape | shape of the tensor |
---|---|
data | buffer of strings to initialize the tensor with |
Returns
- the new tensor
public static abstract TString tensorOf (NdArray<String> src)
Allocates a new tensor which is a copy of a given array.
The tensor will have the same shape as the source array and its data will be copied. The strings are encoded into bytes using the UTF-8 charset.
Parameters
src | the source array giving the shape and data to the new tensor |
---|---|
Returns
- the new tensor
public static abstract TString tensorOf (Charset charset, Shape shape, DataBuffer<String> data)
Allocates a new tensor with the given shape and data.
The data will be copied from the provided buffer to the tensor after it is allocated. The strings are encoded into bytes using the given charset.
If the charset is different from the default UTF-8, it must also be provided explicitly when reading data from the tensor, via using(Charset):
// Given `originalStrings`, an initialized buffer of strings
TString tensor =
    TString.tensorOf(StandardCharsets.UTF_16, Shape.of(originalStrings.size()), originalStrings);
...
TString tensorStrings = tensor.data().using(StandardCharsets.UTF_16);
assertEquals(originalStrings.getObject(0), tensorStrings.getObject(0));
Parameters
charset | charset to use for encoding the strings into bytes |
---|---|
shape | shape of the tensor |
data | buffer of strings to initialize the tensor with |
Returns
- the new tensor
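The requirement to read with the same charset that was used for writing can be illustrated without TensorFlow. The sketch below uses only `java.nio.charset` and `String`. The class `CharsetRoundTrip` and its `encode` and `decode` helpers are hypothetical stand-ins for the encoding and decoding steps, not the library's implementation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Stand-in for the encoding step performed when a tensor is created.
    public static byte[] encode(String value, Charset charset) {
        return value.getBytes(charset);
    }

    // Stand-in for the decoding step performed when a value is read back.
    public static String decode(byte[] stored, Charset charset) {
        return new String(stored, charset);
    }

    public static void main(String[] args) {
        String original = "TensorFlow";
        byte[] stored = encode(original, StandardCharsets.UTF_16);

        // Decoding with the matching charset restores the original value.
        String matching = decode(stored, StandardCharsets.UTF_16);

        // Decoding with the UTF-8 default misreads the same bytes.
        String mismatched = decode(stored, StandardCharsets.UTF_8);

        System.out.println(original.equals(matching));   // true
        System.out.println(original.equals(mismatched)); // false
    }
}
```

The stored bytes are identical in both reads; only the interpretation differs, which is why the charset has to be supplied again at read time.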
public static abstract TString tensorOf (Charset charset, NdArray<String> src)
Allocates a new tensor which is a copy of a given array.
The tensor will have the same shape as the source array and its data will be copied. The strings are encoded into bytes using the given charset.
If the charset is different from the default UTF-8, it must also be provided explicitly when reading data from the tensor, via using(Charset):
// Given `originalStrings`, an initialized vector of strings
TString tensor = TString.tensorOf(StandardCharsets.UTF_16, originalStrings);
...
TString tensorStrings = tensor.data().using(StandardCharsets.UTF_16);
assertEquals(originalStrings.getObject(0), tensorStrings.getObject(0));
Parameters
charset | charset to use for encoding the strings into bytes |
---|---|
src | the source array giving the shape and data to the new tensor |
Returns
- the new tensor
public static abstract TString tensorOfBytes (Shape shape, DataBuffer<byte[]> data)
Allocates a new tensor with the given shape and raw bytes.
The data will be copied from the provided buffer to the tensor after it has been allocated.
If the data must also be read back as raw bytes, this must be requested explicitly by invoking asBytes() on the returned data:
byte[] bytes = tensor.data().asBytes().getObject(0); // returns the first sequence of bytes in the tensor
Parameters
shape | shape of the tensor to create |
---|---|
data | buffer of byte sequences to initialize the tensor with |
Returns
- the new tensor
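One reason to read raw bytes back with asBytes() rather than as strings is that arbitrary byte sequences are not always valid text. The following sketch, which uses only the JDK, shows bytes being altered when they pass through a UTF-8 `String`. The class `RawBytesVsString` and the method `throughString` are hypothetical names for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RawBytesVsString {

    // Decodes the bytes as UTF-8 text, then encodes the text back to bytes.
    public static byte[] throughString(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF is never valid in UTF-8, so this sequence is not text.
        byte[] raw = {(byte) 0xFF, 0x00, 0x7F};

        // The invalid byte is replaced during decoding, so the
        // re-encoded bytes no longer match the original sequence.
        byte[] viaString = throughString(raw);

        System.out.println(Arrays.equals(raw, viaString)); // false
    }
}
```

Reading the values as `byte[]` avoids the decoding step entirely, so the original sequence is preserved.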
public static abstract TString tensorOfBytes (NdArray<byte[]> src)
Allocates a new tensor which is a copy of a given array of raw bytes.
The tensor will have the same shape as the source array and its data will be copied.
If the data must also be read back as raw bytes, this must be requested explicitly by invoking asBytes() on the returned data:
byte[] bytes = tensor.data().asBytes().getObject(0); // returns the first sequence of bytes in the tensor
Parameters
src | the source array giving the shape and data to the new tensor |
---|---|
Returns
- the new tensor
public abstract TString using (Charset charset)
Use a specific charset for decoding data from a string tensor, instead of the default UTF-8.
The charset must match the one used for encoding the string values when the tensor was created. For example:
TString tensor =
    TString.tensorOf(StandardCharsets.UTF_16, NdArrays.scalarOfObject("TensorFlow"));
assertEquals("TensorFlow", tensor.data().using(StandardCharsets.UTF_16).getObject());
Parameters
charset | charset to use |
---|---|
Returns
- string tensor data using this charset
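As a rough mental model of the behavior described above, using(Charset) can be pictured as returning another view over the same stored bytes, with only the decoding charset changed. The sketch below is a hypothetical JDK-only model: the class `DecodedView` is not part of the TensorFlow API, and the assumption that no bytes are copied or re-encoded belongs to this sketch rather than to the library's documentation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for string tensor data: stored bytes plus the
// charset used to decode them. Not part of the TensorFlow API.
public final class DecodedView {
    private final byte[] bytes;
    private final Charset charset;

    private DecodedView(byte[] bytes, Charset charset) {
        this.bytes = bytes;
        this.charset = charset;
    }

    // Encodes the value once; reads default to UTF-8 decoding.
    public static DecodedView of(Charset encoding, String value) {
        return new DecodedView(value.getBytes(encoding), StandardCharsets.UTF_8);
    }

    // Returns a new view over the same bytes with a different decoding charset.
    public DecodedView using(Charset decoding) {
        return new DecodedView(bytes, decoding);
    }

    public String get() {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        DecodedView view = DecodedView.of(StandardCharsets.UTF_16, "TensorFlow");
        System.out.println(view.using(StandardCharsets.UTF_16).get()); // TensorFlow
    }
}
```

In this model the original view keeps its UTF-8 default, so a caller that forgets to call `using` reads the wrong text rather than getting an error.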
public static abstract TString vectorOf (String... values)
Allocates a new tensor for storing a vector of strings.
The strings are encoded into bytes using the UTF-8 charset.
Parameters
values | values to store in the new tensor |
---|---|
Returns
- the new tensor