|
Hello,
I'm encountering an issue when running the daos cont create command on my DAOS setup. The command fails with a "Transport layer mercury error." Below are the details of the error and my setup:
Command and Error Message:
[root@client2 ~]# daos cont create tank --label mycont
external ERR # [5323.920594] mercury->msg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/na/na_ofi.c:3047
# na_ofi_msg_send(): fi_tsend() failed, rc: -2 (No such file or directory)
external ERR # [5323.921055] mercury->hg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/mercury_core.c:2727
# hg_core_forward_na(): Could not post send for input buffer (NA_NOENTRY)
hg ERR src/cart/crt_hg.c:1104 crt_hg_req_send_cb(0x2a5c8c0) [opc=0x1020004 (DAOS) rpcid=0x18ea69b600000000 rank:tag=0 ] RPC failed; rc: DER_HG(-1020): 'Transport layer mercury error'
mgmt ERR src/mgmt/cli_mgmt.c:882 dc_mgmt_pool_find() tank: failed to get PS replicas from 1 servers, DER_HG(-1020): 'Transport layer mercury error'
pool ERR src/pool/cli.c:198 dc_pool_choose_svc_rank() 00000000:tank: dc_mgmt_pool_find() failed, DER_HG(-1020): 'Transport layer mercury error'
pool ERR src/pool/cli.c:503 dc_pool_connect_internal() 00000000:tank: cannot find pool service: DER_HG(-1020): 'Transport layer mercury error'
ERROR: daos: DER_HG(-1020): Transport layer mercury error
Environment Details:
DAOS Version: daos-2.0.3-5.el7.x86_64
DAOS Client Version: daos-client-2.0.3-5.el7.x86_64
Libfabric Version: libfabric-1.15.1-1.el7.x86_64
Mercury Version: mercury-2.1.0~rc4-9.el7.x86_64
CentOS Version: CentOS 7.9
Fabric Interface: enp0s3
Additional Information:
[root@server ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:bd:95:c2 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic enp0s3
valid_lft 564sec preferred_lft 564sec
inet6 fe80::e25:a2fd:9904:a8ac/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:bb:cb:4d brd ff:ff:ff:ff:ff:ff
inet 192.168.56.104/24 brd 192.168.56.255 scope global noprefixroute dynamic enp0s8
valid_lft 370sec preferred_lft 370sec
inet6 fe80::8d85:5b39:5f73:6e0b/64 scope link noprefixroute
valid_lft forever preferred_lft forever
I have also included the DAOS server, client, and agent configuration files below for reference.
DAOS Server
## default: daos_server
name: daos_server
#
#
## Access points
## Immutable after running "dmg storage format".
#
## To operate, DAOS will need a quorum of access point nodes to be available.
## Must have the same value for all agents and servers in a system.
## Hosts can be specified with or without port. The default port that is set
## up in port: will be used if a port is not specified here.
#
## default: hostname of this node
access_points: ['10.0.2.15']
#
#
## Default control plane port
#
## Port number to bind daos_server to. This will also be used when connecting
## to access points, unless a port is specified in access_points:
#
## default: 10001
port: 10001
#
#
## Transport credentials specifying certificates to secure communications
#
transport_config:
# # In order to disable transport security, uncomment and set allow_insecure
# # to true. Not recommended for production configurations.
  allow_insecure: true
#
# # Location where daos_server will look for Client certificates
  client_cert_dir: /etc/daos/certs/clients
# # Custom CA Root certificate for generated certs
  ca_cert: /etc/daos/certs/daosCA.crt
# # Server certificate for use in TLS handshakes
  cert: /etc/daos/certs/server.crt
# # Key portion of Server Certificate
  key: /etc/daos/certs/server.key
provider: ofi+sockets
socket_dir: /var/run/daos_server
nr_hugepages: 4096
control_log_mask: DEBUG
control_log_file: /tmp/daos_server.log
helper_log_file: /tmp/daos_admin.log
engines:
-
  targets: 8
  nr_xs_helpers: 0
  fabric_iface: enp0s3
  fabric_iface_port: 31316
  log_mask: INFO
  log_file: /tmp/daos_engine_0.log
  env_vars:
  - CRT_TIMEOUT=30
  scm_mount: /mnt/daos0
  scm_class: ram
  scm_size: 8
DAOS Control file
# default: daos_server
name: daos_server
# Default destination port to use when connecting to hosts in the hostlist.
# default: 10001
port: 10001
# Hostlist, a comma separated list of addresses (hostnames or IPv4 addresses).
# default: ['localhost']
hostlist: ['10.0.2.15']
## Transport Credentials Specifying certificates to secure communications
transport_config:
# # In order to disable transport security, uncomment and set allow_insecure
# # to true. Not recommended for production configurations.
  allow_insecure: true
#
# # Custom CA Root certificate for generated certs
  ca_cert: /etc/daos/certs/daosCA.crt
# # Admin certificate for use in TLS handshakes
  cert: /etc/daos/certs/admin.crt
# # Key portion of Admin Certificate
  key: /etc/daos/certs/admin.key
DAOS Agent file
# default: daos_server
name: daos_server
# Management server access points
# Must have the same value for all agents and servers in a system.
# default: hostname of this node
access_points: ['10.0.2.15']
# Force different port number to connect to access points.
# default: 10001
port: 10001
## Transport Credentials Specifying certificates to secure communications
#
transport_config:
# # In order to disable transport security, uncomment and set allow_insecure
# # to true. Not recommended for production configurations.
  allow_insecure: true
#
# # Custom CA Root certificate for generated certs
  ca_cert: /etc/daos/certs/daosCA.crt
# # Agent certificate for use in TLS handshakes
  cert: /etc/daos/certs/agent.crt
# # Key portion of Agent Certificate
  key: /etc/daos/certs/agent.key
# Use the given directory for creating unix domain sockets
#
# NOTE: Do not change this when running under systemd control. If it needs to
# be changed, then make sure that it matches the RuntimeDirectory setting
# in /usr/lib/systemd/system/daos_agent.service
#
# default: /var/run/daos_agent
#runtime_dir: /var/run/daos_agent
# Full path and name of the DAOS agent logfile.
# default: /tmp/daos_agent.log
log_file: /tmp/daos_agent.log
# Manually define the fabric interfaces and domains to be used by the agent,
# organized by NUMA node.
# If not defined, the agent will automatically detect all fabric interfaces and
# select appropriate ones based on the server preferences.
#
#fabric_ifaces:
#-
#  numa_node: 0
#  devices:
#  -
#    iface: ib0
#    domain: mlx5_0
#  -
#    iface: ib1
#    domain: mlx5_1
#-
#  numa_node: 1
#  devices:
#  -
#    iface: ib2
#    domain: mlx5_2
#  -
#    iface: ib3
#    domain: mlx5_3
Any assistance or insights into resolving this issue would be greatly appreciated. Thank you!
|
|
|
|
|
I don't know anything about DAOS so I can't comment on that. Have you had this working in the past and it stopped working, or are you trying to get it running now? Either way, CentOS 7 was initially released in 2014, and goes EOL at the end of the month. What you might be seeing is that C7 doesn't support the infrastructure needed for DAOS.
This would probably be better answered in a forum dedicated to DAOS.
In any case, given the short remaining support lifetime of C7, I'd recommend you consider moving to something more recent, like CentOS Stream 9 or one of the other RHEL 9 based distros. I hear good things about Rocky Linux. Or move to Ubuntu or Debian.
"A little song, a little dance, a little seltzer down your pants"
Chuckles the clown
|
|
|
|
|
I am looking for help implementing and testing rfcomm on Ubuntu (Description: Ubuntu 22.04.4 LTS).
When I type
rfcomm -a
I get this response
nov25-1@nov251-desktop:~$ rfcomm --a
nov25-1@nov251-desktop:~$
When I type
rfcomm -h
I get this
nov25-1@nov251-desktop:~$ rfcomm --h
RFCOMM configuration utility ver 5.64
Usage:
rfcomm [options] <command> <dev>
Options:
-i, --device [hciX|bdaddr] Local HCI device or BD Address
-h, --help Display help
-r, --raw Switch TTY into raw mode
-A, --auth Enable authentication
-E, --encrypt Enable encryption
-S, --secure Secure connection
-C, --central Become the central of a piconet
-L, --linger [seconds] Set linger timeout
-a Show all devices (default)
Commands:
bind <dev> <bdaddr> [channel] Bind device
release <dev> Release device
show <dev> Show device
connect <dev> <bdaddr> [channel] Connect device
listen <dev> [channel [cmd]] Listen
watch <dev> [channel [cmd]] Watch
nov25-1@nov251-desktop:~$
I am using this link
Ubuntu Manpage: rfcomm - RFCOMM configuration utility
Any help would be greatly appreciated
Thanks
|
|
|
|
|
Salvatore Terress wrote: rfcomm --a
That has two dashes.
|
|
|
|
|
It makes no difference - two dashes (--) or one (-) - and
what does
"default" do?
Should then
rfcomm == rfcomm -a ?
|
|
|
|
|
Which devices have you configured?
If you haven't configured any, what would you expect when asking for a list of configured devices?
Religious freedom is the freedom to say that two plus two make five.
|
|
|
|
|
I am not blaming anybody but MYSELF.
There is no plain-English command in the rfcomm man page
to "configure".
However, there is "bind", which to me means something like "to associate (with)",
so one has to guess to at least select a "device" and optionally more...
Actually, to go back to Unix - "no response / no reply is SUCCESS".
So when
"rfcomm" is used with its default, with no options specified, and returns NOTHING, it is really telling me "NOTHING".
Lesson learned - case closed.
|
|
|
|
|
Are you sure that you know enough about Bluetooth to handle it?
I mean, treating "bind" as some new, unknown concept suggests that you do not know the first thing about Bluetooth. You will not be able to configure a device that you don't have the slightest clue about.
First thing in configuring a Bluetooth device is to learn what Bluetooth is, and its architecture.
Religious freedom is the freedom to say that two plus two make five.
|
|
|
|
|
Well according to rfcomm you do not have any devices configured.
|
|
|
|
|
Yeah they over-abbreviated the command. Should have spelled it rt fm comm.
Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012
|
|
|
|
|
system(3) - Linux manual page: https://man7.org/linux/man-pages/man3/system.3.html
This is somewhat of a dupe, so if it bothers you... (I am not allowed to give instructions...)
I am asking for generic / general help in capturing Linux command output.
I am going to use this as an example:
system("hcitool dev | cut -sf3 ");
I know what it does - that is NOT my concern.
I can see the result in the (Qt) console.
I do not know HOW - what "hidden" code (std??) puts
the actual result of the call onto the console,
and
how can I redirect such a call for my own purpose?
I did RTFM for "system" and found NO references, or missed them,
on how to capture THE actual output of the command.
That said - what would be a SUCCESSFUL phrase to ask Mrs Google to
give me a general Linux resource (text book?) that covers
"standard output / input / error"
and
how to figure out whether a command uses
"standard output / input / error"
and how to access it in C/C++?
..all this just for learning purpose
CHEERS
THANks
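A minimal sketch of one common approach, assuming a POSIX system. system() lets the child process inherit the caller's standard output, which is why the text simply appears in the console. popen() instead hands back a pipe that the program can read. The helper name capture_stdout is made up for this illustration, and only standard output is captured (standard error still goes to the console unless the command itself adds 2>&1).

```cpp
#include <stdio.h>    // popen, pclose, fgets (POSIX)
#include <array>
#include <stdexcept>
#include <string>

// Hypothetical helper: run a shell command and return what it wrote to stdout.
std::string capture_stdout(const std::string& command) {
    // popen() runs the command through /bin/sh and connects its stdout to a pipe.
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) {
        throw std::runtime_error("popen() failed");
    }

    std::string output;
    std::array<char, 256> buffer;
    // Read the pipe chunk by chunk until the command closes its stdout.
    while (fgets(buffer.data(), static_cast<int>(buffer.size()), pipe) != nullptr) {
        output += buffer.data();
    }

    // pclose() waits for the command to finish; its exit status is ignored in this sketch.
    pclose(pipe);
    return output;
}
```

With this in place, the call from the question could be written as capture_stdout("hcitool dev | cut -sf3"), and the returned string holds the text that system() would have printed.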
|
|
|
|
|
|
I have reached another dead end in my Bluetooth programming...
Without getting into details - the task works fine when performed in a Linux terminal,
but fails to connect when run from C++ code.
After RTFM I found
"the error source" in the "bluez" library, buried in a few layers of undocumented code...
So I am going back to the "level" where I can control the
usage of the "bluez" library...
That means going back to
https://people.csail.mit.edu/albert/bluez-intro/c404.html
My question
to the forum -
is there a C++ version of the above?
(I asked Mrs Google...)
Yes, the foundation would still be the "bluez" library... but if I can
manage to convert the above to C++ it may be "some" progress.
Thanks very much for any constructive help.
|
|
|
|
|
|
Yes, exactly what I was looking for. Thanks.
Actually it has "d-bus" code that I need to try to "up code",
to try a different approach to Bluetooth coding.
|
|
|
|
|
I have done the individual analysis on my data, and now I'm trying to run the 3dDeconvolve code, but it does not work at all. I don't know what is wrong.
|
|
|
|
|
Sevinc Bayar wrote: I dont know what is wrong.
It is unlikely that anyone here could guess.
|
|
|
|
|
Nobody is going to be able to help you because you haven't shown the code you've written that has the problem nor even described a problem! Just showing up to a forum and saying "it doesn't work" is a complete waste of your time.
|
|
|
|
|
Also not sure why it is a Linux (this forum) problem.
If anyone else is curious about what it is even about, this is from the National Institutes of Health (pdf):
"Program 3dDeconvolve was developed to provide deconvolution analysis of FMRI time series data."
Found with following google search
what is 3dDeconvolve
|
|
|
|
|
Offhand, I'd guess there's an error in the code or the keyboard.
CQ de W5ALT
Walt Fair, Jr.PhD P. E.
Comport Computing
Specializing in Technical Engineering Software
|
|
|
|
|
I am using hcitool to identify the local Bluetooth adapters
attached to my PC via USB.
The actual command is "hcitool dev".
Most of the time I receive hci0 plus an address.
Recently I started getting hci1.
If I attach two adapters
I receive
hci0
hcix
where the x in hcix keeps increasing each time I run the same "hcitool dev" command.
I am looking for a way to keep hci0 as hci0.
Thanks very much for your help, it is appreciated very much
|
|
|
|
|
You might want to research your assumptions.
You are assuming that hcitool controls that, versus just reporting on what is there.
If there is a way to return other attributes you might be able to identify it that way.
I have not worked with this protocol but I have worked with many others so I suggest that you do not assume that your testing by itself will return all possibilities. A general solution might either be coded only to support limited scenarios (those tested) or must provide a way to monitor/report when it is not found, what is found, and potentially a way to configure/add more info to a running app.
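One way to act on the suggestion above is to identify an adapter by its Bluetooth address rather than by its hciX index. This is a minimal sketch, assuming "hcitool dev" prints a "Devices:" header followed by one whitespace-separated "hciX address" line per adapter. The function name parse_hcitool_dev is made up for illustration, and it only parses text that has already been captured.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: turn the text printed by "hcitool dev" into a map
// from adapter address to the hciX name currently assigned to it.
std::map<std::string, std::string> parse_hcitool_dev(const std::string& text) {
    std::map<std::string, std::string> name_by_address;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string name;
        std::string address;
        // Device lines carry two fields (name, address). The "Devices:" header
        // has only one field, so the extraction fails and the line is skipped.
        if (fields >> name >> address && name.rfind("hci", 0) == 0) {
            name_by_address[address] = name;
        }
    }
    return name_by_address;
}
```

Looking an adapter up by its address then gives whichever hciX name it currently has, so the code does not depend on the index staying at 0.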
|
|
|
|
|
Simple, but Mrs Google gives "where space is"... and not "whereis"...
|
|
|
|
|
|
You can put double quotes around a word/phrase in Google to force it.
Salvatore Terress wrote: "whereis" recursively
I doubt that makes any sense. Whereis already recurses via something (execution path?).
Googling shows options for whereis that allow one to limit the search but not expand it.
|
|
|
|