Dealing with Common::Object& parameters in callbacks (C++)

jString.toString().cstr() and jString.cstr() yield different results.
Common::JString test_str("no quotes");
UE_LOG(HGLOG, Warning, TEXT("%s %s"), test_str.toString().cstr(), test_str.cstr());
yields:
HGLOG: Warning: "no quotes" no quotes

I'm currently trying to implement the callback:
void ChatListener::onPrivateMessage(const Common::JString& sender, const Common::Object& message, const Common::JString& channelName)

for Photon chat where parameter message is a more general Common::Object type. I figured to avoid a cast (since I wish to only deal with string types) I would just do a toString().cstr() to convert to another native string format (FString in UE4). I ran into an issue where the incoming message was a json payload, but had parsing errors due to quotes being added to the string in the toString().cstr() method. I can certainly cast to a JString to avoid the quotes, but am I going about this the wrong way? If I only ever send JString objects from the client is it safe to assume the handler methods will only ever receive JStrings as the message parameter, therefore a cast is fine?

Sidenote: I can't seem to find any documentation/examples on defining 'complex types' for photon chat messages. Anywhere I could find these as that might help with my current implementation?

Comments

  • Kaiserludi
    Kaiserludi admin
    edited September 2017
    Hi @onealexleft.

    jString.toString().cstr() and jString.cstr() yield different results.

    Common::JString test_str("no quotes");
    UE_LOG(HGLOG, Warning, TEXT("%s %s"), test_str.toString().cstr(), test_str.cstr());

    yields:
    HGLOG: Warning: "no quotes" no quotes

    That is intended.

    toString() "stringifies" the Object on which it is called, i.e. it gives you a JString representation of that object. When the object on which it is called already is a JString, toString() adds "" around it, as otherwise for a string like "1" it would not be clear whether it already was a string before stringification.
    Calling toString() directly on a JString is rather pointless. The reason that it exists is complex container objects like Hashtables, Dictionaries or JVectors, which may contain JStrings among their elements as well as elements of other types. Stringifying such constructs can be very useful for logging and debugging purposes: it gives you a human-readable representation of the structure of that object, so that you can analyze whether it actually contains what you expect it to contain.

    You can have nested Dictionaries and/or Hashtables as many levels deep as makes sense for your use case, you can have multi-dimensional arrays of them, you can add custom classes as values to them which may be as complex as makes sense, you can have object-arrays of mixed types and you can combine all of that.
    With such complex structures it may not always be obvious to the receiving code what structure the sending code has built and how to read that out again, so seeing a string representation of it can help a lot.

    Now imagine a very simple Hashtable that contains two keys and two values:
    ExitGames::Common::Hashtable hash;
    hash.put(L"key 1", L"1, 2.2f, ([{}])?");
    hash.put(9, L"#&");
    wprintf(L"%ls\n", hash.toString().cstr());
    That results in the following output:
    {"key 1"="1, 2.2f, ([{}])?", 9="#&"}

    I think that example makes it obvious why toString() adds "" around JString instances instead of leaving them completely untouched.
    Otherwise it would be a lot harder to figure out what belongs to a string and what is a different key or value or belongs to the structuring information of the Hashtable itself.
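
    The reasoning above can be sketched in plain C++ without the Photon SDK. This is only an illustration: Value, stringify() and demoStringify() are hypothetical stand-ins and not part of Photon's API. It shows that once strings are quoted, the int 1 and the string "1" remain distinguishable after stringification.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical stand-in for a value that holds either an int or a string.
// This is NOT Photon's Common::Object, only an illustration of the idea.
using Value = std::variant<int, std::wstring>;

// Stringify a value: strings get surrounding quotes, ints do not.
std::wstring stringify(const Value& v)
{
    if (const std::wstring* s = std::get_if<std::wstring>(&v))
        return L"\"" + *s + L"\"";
    return std::to_wstring(std::get<int>(v));
}

// Without the quotes, both values below would stringify to the same text.
void demoStringify()
{
    assert(stringify(Value(1)) == L"1");
    assert(stringify(Value(std::wstring(L"1"))) == L"\"1\"");
}
```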


    I figured to avoid a cast (since I wish to only deal with string types) I would just do a toString().cstr() to convert to another native string format (FString in UE4). I ran into an issue where the incoming message was a json payload, but had parsing errors due to quotes being added to the string in the toString().cstr() method. I can certainly cast to a JString to avoid the quotes, but am I going about this the wrong way? If I only ever send JString objects from the client is it safe to assume the handler methods will only ever receive JStrings as the message parameter, therefore a cast is fine?

    Calling toString() on the received Object is about as wrong a way as you can find when you actually want to access the payload of that Object.

    What you should do is convert the Object into a ValueObject<JString>, either by copy construction or by cast, and then access the payload of that object by a call to getDataCopy() or getDataAddress().
    Copy construction is the safer option, as it checks that the Object actually contains a JString at top level before attempting the conversion. If the Object instead contains e.g. an int, then the attempt to copy-construct a ValueObject<JString> from it will just create an empty ValueObject<JString>, which contains an empty string. Likewise, an attempt to convert an Object instance that contains a JString to a ValueObject<int> would result in an empty ValueObject<int> that just contains the default value of 0. This often is preferable to a cast, which would just result in junk payload when the type is not correct.
    However, checking the result of Object.getType() yourself and then doing a cast if it is what you expect might be preferable in cases in which the Object may contain a huge amount of data, so that you want to avoid a copy for performance reasons.
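
    The "check the type, otherwise fall back to an empty default" behaviour described above can be sketched in standalone C++. Payload, extractOrDefault() and demoExtract() are hypothetical names built on std::variant, assumed purely for illustration. They do not reproduce Photon's Object or ValueObject classes, only the pattern.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical stand-in for a generic payload holder; not Photon's Common::Object.
using Payload = std::variant<int, std::wstring>;

// Returns the stored T if the payload holds a T, otherwise a default-constructed T.
// This mirrors the "empty string / 0 on type mismatch" behaviour described above.
template <typename T>
T extractOrDefault(const Payload& p)
{
    if (const T* value = std::get_if<T>(&p))
        return *value;
    return T();
}

void demoExtract()
{
    const Payload text = std::wstring(L"{\"hp\":10}");
    const Payload number = 42;

    assert(extractOrDefault<std::wstring>(text) == L"{\"hp\":10}"); // payload as sent, no added quotes
    assert(extractOrDefault<std::wstring>(number).empty());         // wrong type: empty string
    assert(extractOrDefault<int>(text) == 0);                       // wrong type: default value 0
}
```

    The point of the sketch is that the payload is read back unchanged, so a JSON string would reach the parser without the extra quotes that stringification adds.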


    EDIT:
    Removed wrong information about conversions between FString and JString - see the posts below for correct information.



    Sidenote: I can't seem to find any documentation/examples on defining 'complex types' for photon chat messages. Anywhere I could find these as that might help with my current implementation?

    For documentation on the supported data types please refer to http://doc-api.photonengine.com/en/cpp/current/html/a05589.html (the direct link may change in future versions, so future readers: just go to http://doc-api.photonengine.com/en/cpp/current and navigate to Common - table of data types from the main page).
    Photon Chat supports the full set of data types for messages that are in general supported by Photon's protocol and described on that page.
    For that reason you can just have a look at demo_typeSupport inside the Client SDK's demos folder for example code for complex types like multi-dimensional arrays, Object-arrays, nested Hashtables and Dictionaries and even custom types. Although demo_typeSupport uses LoadBalancing-cpp, what is shown there can also be used in code that uses Chat-cpp.
  • Hi @Kaiserludi !
    Thanks so much for the detailed explanation and clarification. I can rest easy now that I know how to deal with the api and extend it :). The code I originally wrote felt wrong on many levels, hahaha!

    In regards to string representation in UE4, I haven't been able to find anywhere that TCHAR uses char under the hood.

    Looking at the public source:
    
    // Character types.
    	typedef char				ANSICHAR;	// An ANSI character       -                  8-bit fixed-width representation of 7-bit characters.
    	typedef wchar_t				WIDECHAR;	// A wide character        - In-memory only.  ?-bit fixed-width representation of the platform's natural wide character set.  Could be different sizes on different platforms.
    	typedef uint8				CHAR8;		// An 8-bit character type - In-memory only.  8-bit representation.  Should really be char8_t but making this the generic option is easier for compilers which don't fully support C++11 yet (i.e. MSVC).
    	typedef uint16				CHAR16;		// A 16-bit character type - In-memory only.  16-bit representation.  Should really be char16_t but making this the generic option is easier for compilers which don't fully support C++11 yet (i.e. MSVC).
    	typedef uint32				CHAR32;		// A 32-bit character type - In-memory only.  32-bit representation.  Should really be char32_t but making this the generic option is easier for compilers which don't fully support C++11 yet (i.e. MSVC).
    	typedef WIDECHAR			TCHAR;		// A switchable character  - In-memory only.  Either ANSICHAR or WIDECHAR, depending on a licensee's requirements.
    
    It looks like TCHAR is wchar_t under the hood. Maybe this wasn't the case many versions ago. Also the post you link about changing FString seems to suggest the same conclusion (TCHAR is wchar_t under the hood).
    Even in situations when you pass string literals to FString() it seems to do the conversion (you can use a macro to skip this for efficiency).

    If FString indeed uses wchar_t Unicode, then is doing a jstring.cstr() still considered safe?
  • Hi @onealexleft.

    You are right.
    I remembered otherwise, but after your last post I have rechecked it myself and indeed UE4 by default uses the same string character width and encoding as the Photon C++ Client libs do.

    In that case you are right:
    No string conversion is needed between FString and JString at all.
    The following all works just fine without any actual conversion being involved - all those lines just copy over the characters from one buffer into the other:
    
    JString myJString = *myFString; // when constructing a JString from a FString, we need to access the wchar_t* of the  FString with operator*()
    
    
    FString myFString = myJString.cstr(); // when constructing a FString from a JString, we need to access the wchar_t* of the  JString with cstr()
    
    
    myJString = *myFString; // when assigning the value of a FString to a JString, we need to access the wchar_t* of the  FString with operator*()
    
    
    myFString = myJString; // when assigning the value of a JString to a FString, we DO NOT need to access the wchar_t* of the  JString with cstr(), as the compiler happily detects that the existence of JString::operator wchar_t*() grants it permission to automatically do that for us in this case
    
    Side note:
    The following all work fine, but all involve an implicit conversion from char* to wchar_t* and from UTF-8 to UTF-16/UTF-32 inside the constructors / assignment operators.
    
    JString myJString = "aString";
    
    
    FString myFString = "aString";
    
    
    myJString = "aString";
    
    
    myFString = "aString";
    
    Therefore the following should be strongly preferred, as it avoids the conversion, because the string literal already has the format that both string classes need.
    
    JString myJString = L"aString";
    
    
    FString myFString = L"aString";
    
    
    myJString = L"aString";
    
    
    myFString = L"aString";
    
    As an alternative to L"" UE4 offers TEXT(), which is just a #define that prefixes its argument string with an 'L' when Unreal's TCHAR is defined as wchar_t (which I strongly recommend and which apparently also is their default, as you have figured out), but leaves it without that prefix when TCHAR is defined as char.

    Using TEXT() makes some sense for FString, where you could theoretically switch to char. For JString it actually is preferable to always use the L prefix (either hard-code it or use a different macro that can be defined independently of TEXT and the state of TCHAR), as JString always uses wchar_t internally. Switching the definition of TCHAR to char would suddenly introduce implicit conversions from UTF-8 to UTF-16/UTF-32 for lines like myJString = TEXT("");, while myJString = L""; always works without implicit conversions, no matter how TCHAR is defined.
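
    A macro that is independent of TEXT() could look roughly like the following sketch. PHOTON_WIDE and makeGreeting() are hypothetical names chosen here for illustration; neither Photon nor UE4 ships them. The macro always produces a wchar_t literal, regardless of how TCHAR is defined.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical macro, independent of UE4's TEXT(): always yields a wide string literal.
// It only works when the argument is written directly as a string literal.
#define PHOTON_WIDE(str) L##str

// Compile-time check that the macro really produces wchar_t characters.
static_assert(
    std::is_same<std::remove_cv_t<std::remove_reference_t<decltype(PHOTON_WIDE("a")[0])>>,
                 wchar_t>::value,
    "PHOTON_WIDE must yield a wide literal");

// Builds a wide string from the literal without any narrow-to-wide conversion.
std::wstring makeGreeting()
{
    return PHOTON_WIDE("hello");
}
```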
  • Awesome, thanks for info!